Prerequisite · Matrix Algebra Foundations

Determinants and Matrix Rank

By the end of this reading you will be able to:
  • Compute the determinant of a 2x2 and 3x3 matrix using the cofactor expansion
  • Interpret the determinant as a signed volume scaling factor and use it to determine whether a matrix is invertible
  • Compute the rank of a matrix via row reduction and apply the rank-nullity theorem
  • Explain the trace, its cyclic property tr(ABC) = tr(CAB), and its connection to the eigenvalue sum

The Determinant

The determinant of a square matrix $A$, written $\det(A)$ or $|A|$, is a scalar that encodes whether $A$ is invertible and by how much it stretches or compresses space.

2×2 Case

For a $2 \times 2$ matrix:

$$\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$$

Geometrically: if the two rows are interpreted as vectors in $\mathbb{R}^2$, the determinant equals the signed area of the parallelogram they span. If the rows are parallel (linearly dependent), the parallelogram collapses to a line: area zero, determinant zero.
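As a minimal sketch of the formula above, the following Python computes $ad - bc$ for a few hand-picked matrices. The function name `det2` and the sample rows are illustrative assumptions, not part of the original reading.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

# Rows (3, 0) and (1, 2) span a parallelogram with base 3 and height 2.
area = det2([[3, 0], [1, 2]])        # 3*2 - 0*1 = 6

# Swapping the rows reverses the orientation, so the sign flips.
flipped = det2([[1, 2], [3, 0]])     # 1*0 - 2*3 = -6

# The second row is twice the first, so the parallelogram is degenerate.
collapsed = det2([[1, 2], [2, 4]])   # 1*4 - 2*2 = 0
```

The sign of the result depends on the order of the rows, which is why the area is called signed.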

3×3 Case (Cofactor Expansion)

For a $3 \times 3$ matrix, expand along the first row:

$$\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$

This pattern (alternating signs, each term multiplying a row element by the determinant of the submatrix formed by deleting that element's row and column) is cofactor expansion, and it generalizes to any $n \times n$ matrix.
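The recursion described above can be sketched in a few lines of Python. This is an illustrative implementation under simple assumptions (a square matrix stored as a list of row lists); the helper names `minor` and `det` and the sample matrix are chosen here and do not come from the reading.

```python
def minor(m, row, col):
    """Submatrix of m with the given row and column deleted."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(m) if i != row]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        sign = -1 if j % 2 else 1          # signs alternate +, -, +, ...
        total += sign * m[0][j] * det(minor(m, 0, j))
    return total

B = [[1, 2, 3],
     [0, 1, 4],
     [5, 6, 0]]
# 1*(1*0 - 4*6) - 2*(0*0 - 4*5) + 3*(0*6 - 1*5) = -24 + 40 - 15 = 1
det_B = det(B)
```

Cofactor expansion needs on the order of $n!$ multiplications, so it is practical only for small matrices; for larger ones the determinant is usually obtained from row reduction instead.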

Properties

  • $\det(I) = 1$
  • $\det(A') = \det(A)$, where $A'$ denotes the transpose of $A$
  • $\det(AB) = \det(A)\det(B)$
  • $\det(cA) = c^n \det(A)$ for an $n \times n$ matrix
  • Swapping two rows negates the determinant
  • Adding a multiple of one row to another leaves the determinant unchanged
  • A matrix with two identical rows has determinant zero
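The properties in this list can be spot-checked numerically. The sketch below uses exact integer arithmetic and two arbitrary $3 \times 3$ matrices; the helper names (`det`, `matmul`, `transpose`) and the matrices are assumptions made for illustration, and a check on one example is not a proof.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]   # det(A) = 1
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]   # det(B) = 7
n, c = 3, 2

checks = {
    "transpose": det(transpose(A)) == det(A),
    "product": det(matmul(A, B)) == det(A) * det(B),
    "scaling": det([[c * x for x in row] for row in A]) == c ** n * det(A),
    "row swap": det([A[1], A[0], A[2]]) == -det(A),
    "row addition": det([A[0], [x + 3 * y for x, y in zip(A[1], A[0])], A[2]]) == det(A),
    "identical rows": det([A[0], A[0], A[2]]) == 0,
}
```

Every entry of `checks` evaluates to `True` for these matrices.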

Geometric interpretation in $n$ dimensions

For an $n \times n$ matrix $A$, $|\det(A)|$ is the $n$-dimensional volume of the parallelepiped formed by the rows (or columns) of $A$. The sign encodes orientation. A linear transformation $\mathbf{x} \mapsto A\mathbf{x}$ scales all volumes by $|\det(A)|$.
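To make the volume-scaling claim concrete in two dimensions, the sketch below maps the unit square through a matrix and measures the area of the image with the shoelace formula. The matrix, the helper names `apply` and `shoelace_area`, and the choice of the unit square are all illustrative assumptions.

```python
def apply(m, v):
    """Matrix-vector product for a 2x2 matrix m and a 2-vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def shoelace_area(points):
    """Signed area of a polygon from its vertices listed in order."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

A = [[2, 1],
     [0, 3]]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]      # 2*3 - 1*0 = 6

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]     # area 1, counterclockwise
image = [apply(A, v) for v in unit_square]         # a parallelogram
image_area = shoelace_area(image)                  # 6.0, matching det_A
```

The unit square has area 1, so the area of its image is the scaling factor itself.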

Invertibility and the Determinant

A square matrix $A$ is invertible (also called non-singular) if and only if $\det(A) \neq 0$.

When $\det(A) = 0$:

  • $A$ is singular: it cannot be inverted
  • The columns of $A$ are linearly dependent
  • The transformation $\mathbf{x} \mapsto A\mathbf{x}$ maps all of $\mathbb{R}^n$ into a lower-dimensional subspace
  • The system $A\mathbf{x} = \mathbf{b}$ either has no solution or infinitely many
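A small worked example shows both outcomes for a singular system. The matrix and right-hand sides are chosen here purely for illustration.

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}, \qquad \det(A) = 1 \cdot 4 - 2 \cdot 2 = 0$$

With $\mathbf{b} = (1, 2)'$ the second equation is twice the first, so every $\mathbf{x}$ satisfying $x_1 + 2x_2 = 1$ is a solution and there are infinitely many. With $\mathbf{b} = (1, 3)'$ the system demands $x_1 + 2x_2 = 1$ and $2x_1 + 4x_2 = 3$ at the same time, which is impossible, so there is no solution.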

In ML, a singular covariance matrix signals that the data lie in a lower-dimensional subspace than assumed: at least one feature is an exact linear combination of the others, up to an additive constant.
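The following sketch illustrates the point with made-up data: a third feature is constructed as an exact linear combination of the first two, and the sample covariance matrix becomes singular. The data values and the helper names `det` and `covariance` are assumptions for illustration, and `Fraction` is used so that the zero determinant is exact rather than subject to rounding.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def covariance(columns):
    """Sample covariance matrix of a list of feature columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = Fraction(sum(col), n)
        centered.append([x - mean for x in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]

x1 = [1, 2, 4, 7, 8]
x2 = [3, 1, 5, 2, 6]
x3 = [a + 2 * b for a, b in zip(x1, x2)]    # exact linear combination of x1, x2

det_two = det(covariance([x1, x2]))         # positive: x1 and x2 are not collinear
det_three = det(covariance([x1, x2, x3]))   # exactly zero: x3 adds no new direction
```

With floating-point data the determinant of such a matrix is typically tiny but nonzero, so in practice near-singularity is the thing to watch for.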

Matrix Rank

The rank of a matrix $A$, written $\text{rank}(A)$, is the dimension of its column space, or equivalently the number of linearly independent columns (which always equals the number of linearly independent rows).

For an $n \times K$ matrix:

  • $\text{rank}(A) \leq \min(n, K)$
  • If $\text{rank}(A) = \min(n, K)$, the matrix has full rank
  • For a square $n \times n$ matrix: $A$ is invertible $\iff \text{rank}(A) = n$
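Rank can be computed by row reduction: reduce the matrix to echelon form and count the pivots. The sketch below is one way to do that, assuming exact rational arithmetic via `Fraction` so that no tolerance for "nearly zero" is needed; the function name `rank` and the sample matrices are illustrative.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return pivot_row                        # number of pivots found

full = [[1, 2], [3, 4], [5, 6]]                  # 3x2 with independent columns
deficient = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]    # second row is twice the first

rank_full = rank(full)                # 2 = min(3, 2), so full rank
rank_deficient = rank(deficient)      # 2 < 3, so this square matrix is singular
```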

Rank-Nullity Theorem

For an $n \times K$ matrix $A$:

$$\text{rank}(A) + \text{nullity}(A) = K$$

where $\text{nullity}(A) = \dim(\mathcal{N}(A))$ is the dimension of the null space. Every column "direction" either contributes to the output (rank) or gets killed to zero (null space), and the two together always account for all $K$ input dimensions.
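The theorem can be checked on a small example by computing the reduced row echelon form: pivot columns count toward the rank, and each remaining free column yields one null-space basis vector. The sketch below assumes exact arithmetic with `Fraction`; the function names `rref` and `null_space_basis` and the matrix are chosen for illustration.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form and the list of pivot columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for col in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]          # scale the pivot to 1
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def null_space_basis(matrix):
    """One basis vector of the null space per free (non-pivot) column."""
    reduced, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row, p in zip(reduced, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis

A = [[1, 2, 0, 1],
     [0, 0, 1, 3],
     [1, 2, 1, 4]]           # 3x4, so K = 4; the third row is the sum of the first two

_, pivot_cols = rref(A)
rank_A = len(pivot_cols)               # 2
basis = null_space_basis(A)
nullity_A = len(basis)                 # 2, so rank + nullity = 4 = K
```

Each vector in `basis` is sent to the zero vector by `A`, which is what membership in the null space means.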

Idempotent Matrices

A matrix $M$ is idempotent if $M^2 = MM = M$. Idempotent matrices represent projections: applying the transformation twice gives the same result as applying it once, because the output is already in the target subspace.

The mean-deviation matrix

$$M^0 = I - \frac{1}{n}\mathbf{i}\mathbf{i}'$$

(where $\mathbf{i}$ is the $n \times 1$ vector of ones) is a symmetric idempotent matrix. Pre-multiplying a data vector $\mathbf{x}$ by $M^0$ produces the mean-deviation form $\mathbf{x} - \bar{x}\mathbf{i}$. This matrix appears throughout regression and ANOVA.
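A minimal sketch of the mean-deviation matrix for $n = 4$ follows, verifying symmetry, idempotency, and the centering effect on a sample vector. The function names, the value of `n`, and the data vector are illustrative assumptions; `Fraction` keeps the entries $1 - 1/n$ and $-1/n$ exact.

```python
from fractions import Fraction

def mean_deviation_matrix(n):
    """M0 = I - (1/n) i i', where i is the n x 1 vector of ones."""
    return [[(1 if r == c else 0) - Fraction(1, n) for c in range(n)]
            for r in range(n)]

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

n = 4
M0 = mean_deviation_matrix(n)
x = [2, 4, 4, 10]                                  # sample mean is 5

# Pre-multiplying by M0 subtracts the mean from every entry.
deviations = [sum(m * v for m, v in zip(row, x)) for row in M0]   # [-3, -1, -1, 5]

is_symmetric = all(M0[r][c] == M0[c][r] for r in range(n) for c in range(n))
is_idempotent = matmul(M0, M0) == M0               # centering twice changes nothing
```

Building the full $n \times n$ matrix is useful for checking algebra, but for actual data it is cheaper to subtract the mean directly.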

For any symmetric idempotent matrix $M$:

  • $\text{rank}(M) = \text{tr}(M)$ (trace equals rank)
  • Its eigenvalues are all 0 or 1
  • It represents an orthogonal projection onto its column space
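Applied to the mean-deviation matrix, the first property gives the rank without any row reduction. A short derivation, using only the linearity of the trace and the fact that every diagonal entry of $\mathbf{i}\mathbf{i}'$ equals 1:

$$\text{tr}(M^0) = \text{tr}(I) - \frac{1}{n}\,\text{tr}(\mathbf{i}\mathbf{i}') = n - \frac{1}{n} \cdot n = n - 1$$

So $\text{rank}(M^0) = n - 1$. The single lost dimension is the direction of $\mathbf{i}$ itself, since $M^0\mathbf{i} = \mathbf{i} - \mathbf{i} = \mathbf{0}$.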

Trace

The trace of a square matrix is the sum of its diagonal elements:

$$\text{tr}(A) = \sum_{i=1}^n a_{ii}$$

Key properties:

  • $\text{tr}(A + B) = \text{tr}(A) + \text{tr}(B)$
  • $\text{tr}(AB) = \text{tr}(BA)$ (cyclic property, which holds even when $AB \neq BA$)
  • $\text{tr}(A) = \sum_i \lambda_i$ where $\lambda_i$ are the eigenvalues of $A$, counted with algebraic multiplicity
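These properties can be spot-checked on small matrices. In the sketch below the matrices are arbitrary, the helper names `trace` and `matmul` are assumptions, and the eigenvalues 1 and 3 of the symmetric example are stated and then verified by checking that $\det(S - \lambda I) = 0$ rather than being computed by a solver.

```python
def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[k][k] for k in range(len(m)))

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]

AB = matmul(A, B)            # [[10, 5], [20, 11]]
BA = matmul(B, A)            # [[3, 4], [11, 18]]
same_trace = trace(AB) == trace(BA)      # both traces are 21 although AB != BA

S = [[2, 1], [1, 2]]
eigenvalues = [1, 3]
# A number lam is an eigenvalue of S exactly when det(S - lam * I) = 0.
char_values = [(S[0][0] - lam) * (S[1][1] - lam) - S[0][1] * S[1][0]
               for lam in eigenvalues]   # [0, 0]
trace_matches = trace(S) == sum(eigenvalues)   # 4 == 1 + 3
```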

The cyclic property makes trace useful for simplifying quadratic forms: $\mathbf{x}'A\mathbf{x} = \text{tr}(A\mathbf{x}\mathbf{x}')$, which can be easier to differentiate.
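The identity follows in two steps. A quadratic form is a scalar, and a scalar equals its own trace; the cyclic property then moves $\mathbf{x}'$ from the front to the back:

$$\mathbf{x}'A\mathbf{x} = \text{tr}(\mathbf{x}'A\mathbf{x}) = \text{tr}(A\mathbf{x}\mathbf{x}')$$

As a numerical check with values chosen here for illustration, take $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $\mathbf{x} = (1, 2)'$. Then $A\mathbf{x} = (5, 11)'$, so $\mathbf{x}'A\mathbf{x} = 1 \cdot 5 + 2 \cdot 11 = 27$. On the other side,

$$A\mathbf{x}\mathbf{x}' = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 10 \\ 11 & 22 \end{bmatrix}, \qquad \text{tr}(A\mathbf{x}\mathbf{x}') = 5 + 22 = 27$$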
