Prerequisite · Linear Algebra

Eigenvalues, Diagonalization, and Jordan Form

19 min read
By the end of this reading you will be able to:
  • Find eigenvalues by solving the characteristic equation and compute corresponding eigenvectors
  • Diagonalize a matrix as A = P Lambda P-inv when sufficient linearly independent eigenvectors exist
  • Apply eigendecomposition to compute matrix powers A^k = P Lambda^k P-inv efficiently
  • Explain Jordan canonical form and identify when a matrix is not diagonalisable
  • Connect the eigenspectrum to convergence of gradient descent and power iteration: faster convergence when the spectral gap is large
  • Relate eigendecomposition of A^T A to the SVD and explain why singular values are non-negative square roots of eigenvalues

Similarity and Diagonal Representations

A transformation t:VVt: V \to V can look complicated in one basis and simple in another. The best possible representation is diagonal — where the matrix has nonzero entries only on the main diagonal. A diagonal transformation just scales each basis vector independently:

t(βi)=λiβit(\vec{\beta}_i) = \lambda_i \vec{\beta}_i

When can we find such a basis? The vectors βi\vec{\beta}_i must be sent to scalar multiples of themselves — they are eigenvectors, and λi\lambda_i are eigenvalues.

Eigenvalues and Eigenvectors

For a transformation t:VVt: V \to V, a nonzero vector v\vec{v} is an eigenvector with eigenvalue λ\lambda if:

t(v)=λvt(\vec{v}) = \lambda\vec{v}

In matrix form: Av=λvA\vec{v} = \lambda\vec{v}, or equivalently (AλI)v=0(A - \lambda I)\vec{v} = \vec{0}. For a nonzero v\vec{v} to exist, AλIA - \lambda I must be singular:

det(AλI)=0\det(A - \lambda I) = 0

This is the characteristic equation. Expanding the determinant gives the characteristic polynomial p(λ)p(\lambda) — a degree-nn polynomial whose roots are the eigenvalues.

Finding Eigenvalues and Eigenvectors

  1. Compute p(λ)=det(AλI)p(\lambda) = \det(A - \lambda I) and find its roots λ1,,λk\lambda_1, \ldots, \lambda_k
  2. For each eigenvalue λi\lambda_i, solve (AλiI)v=0(A - \lambda_i I)\vec{v} = \vec{0} by row reduction. The null space N(AλiI)\mathcal{N}(A - \lambda_i I) is the eigenspace of λi\lambda_i — all eigenvectors for that eigenvalue plus 0\vec{0}.

Properties

  • tr(A)=iλi\text{tr}(A) = \sum_i \lambda_i (sum of eigenvalues = trace)
  • det(A)=iλi\det(A) = \prod_i \lambda_i (product of eigenvalues = determinant)
  • Eigenvectors for distinct eigenvalues are linearly independent
  • Real symmetric matrices always have real eigenvalues and orthogonal eigenvectors

Diagonalizability

An n×nn \times n matrix AA is diagonalizable if it is similar to a diagonal matrix:

A=PΛP1A = P\Lambda P^{-1}

where Λ=diag(λ1,,λn)\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_n) and the columns of PP are the corresponding eigenvectors.

When is AA diagonalizable?

  • AA has nn linearly independent eigenvectors (equivalently: the eigenvectors span Rn\mathbb{R}^n)
  • AA has nn distinct eigenvalues → automatically diagonalizable
  • For repeated eigenvalues: AA is diagonalizable iff the geometric multiplicity (dimension of the eigenspace) equals the algebraic multiplicity (multiplicity of the root in p(λ)p(\lambda)) for every eigenvalue

Powers and Functions of a Matrix

Diagonalization makes computing powers trivial:

Ak=PΛkP1,Λk=diag(λ1k,,λnk)A^k = P\Lambda^k P^{-1}, \qquad \Lambda^k = \text{diag}(\lambda_1^k, \ldots, \lambda_n^k)

Matrix exponential: eA=Pdiag(eλ1,,eλn)P1e^A = P\,\text{diag}(e^{\lambda_1}, \ldots, e^{\lambda_n})\,P^{-1}. This underlies the solution of linear differential equations x˙=Ax\dot{\vec{x}} = A\vec{x}.

The Jordan Canonical Form

Not all matrices are diagonalizable. The next best thing is the Jordan canonical form — a block-diagonal matrix where each block corresponds to one eigenvalue:

J=[Jn1(λ1)Jnk(λk)]J = \begin{bmatrix} J_{n_1}(\lambda_1) & & \\ & \ddots & \\ & & J_{n_k}(\lambda_k) \end{bmatrix}

A Jordan block Jm(λ)J_m(\lambda) of size mm is:

Jm(λ)=[λ1000λ10000λ10000λ]J_m(\lambda) = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ & & \ddots & \ddots & \\ 0 & 0 & 0 & \lambda & 1 \\ 0 & 0 & 0 & 0 & \lambda \end{bmatrix}

Diagonal blocks of size 1 are just the eigenvalue — a diagonalizable matrix has all 1×11 \times 1 Jordan blocks. A block of size >1> 1 arises when there are fewer independent eigenvectors than the algebraic multiplicity of the eigenvalue.

Theorem: every square matrix over C\mathbb{C} (or over R\mathbb{R} if we allow complex eigenvectors) is similar to a Jordan canonical form. The Jordan form is unique up to reordering of blocks.

Nilpotent Matrices

A matrix NN is nilpotent if Nk=0N^k = 0 for some kk. The only eigenvalue of a nilpotent matrix is 0. Every Jordan block with eigenvalue 0 is a nilpotent block; any nilpotent matrix is similar to a block of such blocks.

The Jordan form decomposes any transformation as:

A=D+N,DN=NDA = D + N, \quad DN = ND

where DD is diagonalizable (semisimple part) and NN is nilpotent. This is the Jordan-Chevalley decomposition — the foundation of the theory of matrix functions and linear ODEs.

Connection to Spectral Theory in ML

For symmetric matrices (which covariance matrices always are), the Jordan form is always diagonal with real entries — this is the spectral theorem. The eigendecomposition A=QΛQA = Q\Lambda Q' always exists, with QQ orthogonal.

This is why PCA is tractable: the covariance matrix XX/(n1)X'X/(n-1) is symmetric positive semi-definite, so it always diagonalizes with non-negative real eigenvalues and orthogonal eigenvectors. The Jordan form (with its potential for non-diagonal blocks) never arises in practice for covariance matrices.

References
Hefferon 2020 — Linear Algebra, Ch. Five: Similarity, Eigenvalues, Jordan Form