Eigenvalues, Diagonalization, and Jordan Form
- Find eigenvalues by solving the characteristic equation and compute corresponding eigenvectors
- Diagonalize a matrix as A = P Lambda P-inv when sufficient linearly independent eigenvectors exist
- Apply eigendecomposition to compute matrix powers A^k = P Lambda^k P-inv efficiently
- Explain Jordan canonical form and identify when a matrix is not diagonalisable
- Connect the eigenspectrum to convergence of gradient descent and power iteration: faster convergence when the spectral gap is large
- Relate eigendecomposition of A^T A to the SVD and explain why singular values are non-negative square roots of eigenvalues
Similarity and Diagonal Representations
A transformation can look complicated in one basis and simple in another. The best possible representation is diagonal — where the matrix has nonzero entries only on the main diagonal. A diagonal transformation just scales each basis vector independently:
When can we find such a basis? The vectors must be sent to scalar multiples of themselves — they are eigenvectors, and are eigenvalues.
Eigenvalues and Eigenvectors
For a transformation , a nonzero vector is an eigenvector with eigenvalue if:
In matrix form: , or equivalently . For a nonzero to exist, must be singular:
This is the characteristic equation. Expanding the determinant gives the characteristic polynomial — a degree- polynomial whose roots are the eigenvalues.
Finding Eigenvalues and Eigenvectors
- Compute and find its roots
- For each eigenvalue , solve by row reduction. The null space is the eigenspace of — all eigenvectors for that eigenvalue plus .
Properties
- (sum of eigenvalues = trace)
- (product of eigenvalues = determinant)
- Eigenvectors for distinct eigenvalues are linearly independent
- Real symmetric matrices always have real eigenvalues and orthogonal eigenvectors
Diagonalizability
An matrix is diagonalizable if it is similar to a diagonal matrix:
where and the columns of are the corresponding eigenvectors.
When is diagonalizable?
- has linearly independent eigenvectors (equivalently: the eigenvectors span )
- has distinct eigenvalues → automatically diagonalizable
- For repeated eigenvalues: is diagonalizable iff the geometric multiplicity (dimension of the eigenspace) equals the algebraic multiplicity (multiplicity of the root in ) for every eigenvalue
Powers and Functions of a Matrix
Diagonalization makes computing powers trivial:
Matrix exponential: . This underlies the solution of linear differential equations .
The Jordan Canonical Form
Not all matrices are diagonalizable. The next best thing is the Jordan canonical form — a block-diagonal matrix where each block corresponds to one eigenvalue:
A Jordan block of size is:
Diagonal blocks of size 1 are just the eigenvalue — a diagonalizable matrix has all Jordan blocks. A block of size arises when there are fewer independent eigenvectors than the algebraic multiplicity of the eigenvalue.
Theorem: every square matrix over (or over if we allow complex eigenvectors) is similar to a Jordan canonical form. The Jordan form is unique up to reordering of blocks.
Nilpotent Matrices
A matrix is nilpotent if for some . The only eigenvalue of a nilpotent matrix is 0. Every Jordan block with eigenvalue 0 is a nilpotent block; any nilpotent matrix is similar to a block of such blocks.
The Jordan form decomposes any transformation as:
where is diagonalizable (semisimple part) and is nilpotent. This is the Jordan-Chevalley decomposition — the foundation of the theory of matrix functions and linear ODEs.
Connection to Spectral Theory in ML
For symmetric matrices (which covariance matrices always are), the Jordan form is always diagonal with real entries — this is the spectral theorem. The eigendecomposition always exists, with orthogonal.
This is why PCA is tractable: the covariance matrix is symmetric positive semi-definite, so it always diagonalizes with non-negative real eigenvalues and orthogonal eigenvectors. The Jordan form (with its potential for non-diagonal blocks) never arises in practice for covariance matrices.