Prerequisite · Matrix Algebra Foundations

Eigenvalues, Eigenvectors, and Decompositions

18 min read
0:00
Audio overview generated with
By the end of this reading you will be able to:
  • Find the eigenvalues of a matrix by solving the characteristic equation and compute corresponding eigenvectors
  • Apply the spectral theorem to write a symmetric matrix as A = Q Lambda Q^T and explain why this guarantees real eigenvalues and orthogonal eigenvectors
  • Interpret the SVD factors U, Sigma, V^T geometrically and identify what each does to an input vector
  • Explain PCA as eigendecomposition of the sample covariance matrix and state what the eigenvalues and eigenvectors represent
  • Distinguish positive definite from positive semi-definite matrices by their eigenvalues and identify which arises from covariance matrices

Eigenvalues and Eigenvectors

When a matrix AA multiplies most vectors, both the direction and magnitude change. An eigenvector is a special nonzero vector whose direction is preserved — it is only scaled:

Aq=λqA\mathbf{q} = \lambda \mathbf{q}

The scalar λ\lambda is the corresponding eigenvalue. The eigenvector gets stretched (λ>1|\lambda| > 1), compressed (λ<1|\lambda| < 1), reversed (λ<0\lambda < 0), or left unchanged (λ=1\lambda = 1) — but never rotated off its line.

Finding Eigenvalues: Characteristic Equation

Rearranging Aq=λqA\mathbf{q} = \lambda\mathbf{q} gives (AλI)q=0(A - \lambda I)\mathbf{q} = \mathbf{0}. For a nonzero solution q\mathbf{q} to exist, the matrix AλIA - \lambda I must be singular:

det(AλI)=0\det(A - \lambda I) = 0

This is the characteristic equation. For an n×nn \times n matrix it produces a degree-nn polynomial in λ\lambda — the characteristic polynomial — whose roots are the nn eigenvalues (counted with multiplicity, possibly complex).

Example: 2×2 matrix

For A=[3102]A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}:

det(AλI)=(3λ)(2λ)0=λ25λ+6=(λ2)(λ3)=0\det(A - \lambda I) = (3-\lambda)(2-\lambda) - 0 = \lambda^2 - 5\lambda + 6 = (\lambda-2)(\lambda-3) = 0

Eigenvalues: λ1=2\lambda_1 = 2, λ2=3\lambda_2 = 3. For each, substitute back to find the eigenvector by solving (AλI)q=0(A - \lambda I)\mathbf{q} = \mathbf{0}.

Properties

  • tr(A)=iλi\text{tr}(A) = \sum_i \lambda_i — trace equals sum of eigenvalues
  • det(A)=iλi\det(A) = \prod_i \lambda_i — determinant equals product of eigenvalues
  • If λ=0\lambda = 0 is an eigenvalue, AA is singular (verifies the determinant connection)
  • Eigenvalues of A1A^{-1} are 1/λi1/\lambda_i
  • Eigenvalues of AkA^k are λik\lambda_i^k

Symmetric Matrices: Spectral Theorem

For symmetric matrices (A=AA = A'), the eigenstructure is especially clean. The spectral theorem guarantees:

  1. All eigenvalues are real
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal
  3. There exists a complete orthonormal basis of eigenvectors

This means any symmetric n×nn \times n matrix can be decomposed as:

A=QΛQA = Q \Lambda Q'

where QQ is an orthogonal matrix whose columns are the eigenvectors, and Λ=diag(λ1,,λn)\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_n) is the diagonal matrix of eigenvalues. This is the eigendecomposition (or spectral decomposition).

Since QQ is orthogonal (QQ=IQ'Q = I), this can also be written:

A=i=1nλiqiqiA = \sum_{i=1}^n \lambda_i \mathbf{q}_i \mathbf{q}_i'

Each term λiqiqi\lambda_i \mathbf{q}_i \mathbf{q}_i' is a rank-1 matrix — a scaled outer product. The full matrix is a sum of nn rank-1 pieces along orthogonal directions.

Positive Definite Matrices

A symmetric matrix AA is positive definite (PD) if:

xAx>0for all nonzero x\mathbf{x}'A\mathbf{x} > 0 \quad \text{for all nonzero } \mathbf{x}

Equivalently: all eigenvalues are strictly positive. It is positive semi-definite (PSD) if xAx0\mathbf{x}'A\mathbf{x} \geq 0 (eigenvalues 0\geq 0).

Covariance matrices are always PSD (PD when the data has full rank). The 3DGS covariance Σ=RSSR\Sigma = RSS'R' is PSD by construction — SS is diagonal with non-negative entries, so SSSS' has non-negative eigenvalues, and orthogonal RR preserves the sign.

Singular Value Decomposition (SVD)

The eigendecomposition requires a square matrix. The singular value decomposition generalizes it to any n×Kn \times K matrix:

A=UΣVA = U \Sigma V'

where:

  • UU is n×nn \times n orthogonal — left singular vectors (columns span the column space of AA)
  • Σ\Sigma is n×Kn \times K diagonal — singular values σ1σ20\sigma_1 \geq \sigma_2 \geq \cdots \geq 0 on the diagonal
  • VV is K×KK \times K orthogonal — right singular vectors (columns span the row space of AA)

The singular values are the square roots of the eigenvalues of AAA'A (equivalently AAAA'). The number of nonzero singular values equals the rank of AA.

Truncated SVD and Low-Rank Approximation

Keeping only the top rr singular values gives the best rank-rr approximation of AA in the Frobenius norm:

Ar=i=1rσiuiviA_r = \sum_{i=1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i'

This is the mathematical foundation of PCA, latent semantic analysis, and matrix factorization methods. Throwing away small singular values discards the directions of least variance while retaining the most informative structure.

Application: PCA via Eigendecomposition

Given a data matrix XX (nn observations, KK features, mean-centered), the sample covariance matrix is:

S=1n1XXS = \frac{1}{n-1} X'X

SS is K×KK \times K, symmetric, and positive semi-definite. Its eigendecomposition S=QΛQS = Q\Lambda Q' gives:

  • Principal components: the eigenvectors q1,,qK\mathbf{q}_1, \ldots, \mathbf{q}_K (columns of QQ) — orthogonal directions of maximum variance
  • Explained variance: the eigenvalue λi\lambda_i is the variance of the data projected onto qi\mathbf{q}_i
  • Projection: Z=XQrZ = XQ_r (where QrQ_r keeps the top rr eigenvectors) gives the rr-dimensional PCA embedding

In 3DGS, each Gaussian's scale parameters (sx,sy,sz)(s_x, s_y, s_z) define the square roots of the eigenvalues of the Gaussian's covariance — they directly encode the variance along the three principal axes of the ellipsoid.

Cholesky Decomposition

For a positive definite matrix AA, the Cholesky decomposition is:

A=LLA = LL'

where LL is a lower triangular matrix with positive diagonal entries. It is the matrix analogue of the square root. Cholesky is numerically preferred over full eigendecomposition when you only need a factored form — it runs in O(n3/3)O(n^3/3) rather than O(n3)O(n^3) and is more numerically stable. It is used in multivariate normal sampling and as a preconditioner in optimization.

References
Strang 2016 — Introduction to Linear Algebra, 5th ed., Ch. 6–7
Greene 2003 — Econometric Analysis, 5th ed., Appendix A.6–A.8