Prerequisite · Linear Algebra

Linear Independence and Bases

16 min read
By the end of this reading you will be able to:
  • Test a set of vectors for linear independence by solving the homogeneous equation
  • Construct a basis for a subspace from a spanning set by removing dependent vectors via row reduction
  • Compute the coordinate representation of a vector in a non-standard basis
  • Explain why word and token embeddings implicitly define coordinate systems and what linear independence means in that context

Linear Independence

A set of vectors {v1,,vk}\{\vec{v}_1, \ldots, \vec{v}_k\} from a vector space VV is linearly independent if the only way to write 0\vec{0} as a linear combination is the trivial one:

c1v1+c2v2++ckvk=0    c1=c2==ck=0c_1 \vec{v}_1 + c_2 \vec{v}_2 + \cdots + c_k \vec{v}_k = \vec{0} \implies c_1 = c_2 = \cdots = c_k = 0

If any non-trivial combination exists, the set is linearly dependent — at least one vector is redundant (it lies in the span of the others).

Testing for Independence

Form the matrix AA whose columns are v1,,vk\vec{v}_1, \ldots, \vec{v}_k and row-reduce [A0][A \mid \vec{0}]. The set is independent iff the only solution is c=0\vec{c} = \vec{0}, i.e., iff every column contains a pivot (no free variables).

Dependency Relations

If the set is dependent, the reduction reveals the dependency: any free variable corresponds to a non-trivial combination equaling 0\vec{0}. This tells you exactly which vector is redundant and how to express it.

Example: are v1=(1,2,0)\vec{v}_1 = (1,2,0)', v2=(0,1,1)\vec{v}_2 = (0,1,1)', v3=(1,4,2)\vec{v}_3 = (1,4,2)' independent?

[101021400120][101001200000]\left[\begin{array}{rrr|r}1&0&1&0\\2&1&4&0\\0&1&2&0\end{array}\right] \to \left[\begin{array}{rrr|r}1&0&1&0\\0&1&2&0\\0&0&0&0\end{array}\right]

Free variable c3c_3: set c3=1c_3 = 1 to get c2=2c_2 = -2, c1=1c_1 = -1. So v3=v1+2v2\vec{v}_3 = \vec{v}_1 + 2\vec{v}_2 — the set is dependent.

Basis

A basis for a vector space VV is a set of vectors B={β1,,βn}B = \{\vec{\beta}_1, \ldots, \vec{\beta}_n\} that simultaneously:

  1. Spans VV: every vector in VV is a linear combination of the βi\vec{\beta}_i
  2. Is linearly independent: no vector is redundant

A basis is the minimal spanning set — remove any vector and it no longer spans. It is also the maximal independent set — add any vector and dependence appears.

Existence of Bases

Every vector space has a basis (this requires the Axiom of Choice for infinite-dimensional spaces, but is elementary for finite-dimensional ones). Starting from any spanning set, you can build a basis by removing redundant vectors one at a time until independence is achieved.

The Standard Basis for Rn\mathbb{R}^n

The standard basis En={e1,,en}\mathcal{E}_n = \{\vec{e}_1, \ldots, \vec{e}_n\} has ei\vec{e}_i equal to the column of the identity matrix InI_n with a 11 in position ii:

e1=(100),e2=(010),,en=(001)\vec{e}_1 = \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}, \quad \vec{e}_2 = \begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix}, \quad \ldots, \quad \vec{e}_n = \begin{pmatrix}0\\0\\\vdots\\1\end{pmatrix}

Any vector v=(v1,,vn)\vec{v} = (v_1, \ldots, v_n)' is uniquely expressed as v1e1++vnenv_1\vec{e}_1 + \cdots + v_n\vec{e}_n.

Other Bases for P2\mathcal{P}_2

The space of degree-2\leq 2 polynomials has basis {1,x,x2}\{1, x, x^2\} (standard) or equally {1,1+x,1+x+x2}\{1, 1+x, 1+x+x^2\} — any three independent polynomials that span P2\mathcal{P}_2.

Unique Representation

The most important property of a basis is uniqueness of representation:

Theorem: if B={β1,,βn}B = \{\vec{\beta}_1, \ldots, \vec{\beta}_n\} is a basis for VV, then every vV\vec{v} \in V can be written as c1β1++cnβnc_1\vec{\beta}_1 + \cdots + c_n\vec{\beta}_n in exactly one way.

Proof: suppose v=ciβi=diβi\vec{v} = \sum c_i\vec{\beta}_i = \sum d_i\vec{\beta}_i. Subtracting: (cidi)βi=0\sum (c_i - d_i)\vec{\beta}_i = \vec{0}. Independence forces ci=dic_i = d_i for all ii.

The coefficients (c1,,cn)(c_1, \ldots, c_n) are the coordinates of v\vec{v} with respect to basis BB, written RepB(v)\mathrm{Rep}_B(\vec{v}). Choosing a basis is choosing a coordinate system.

Why Bases Matter for ML

In machine learning, features are basis representations: the columns of a data matrix form (implicitly) a basis for the feature space. PCA changes basis to one aligned with the directions of maximum variance. Neural network weight matrices change basis between layers. Understanding what a basis is — a minimal, non-redundant, complete coordinate system — makes all of these operations transparent.

Testing Independence and Computing Representations

import numpy as np

# Columns of A are v1, v2, v3
A = np.array([[1., 0., 1.],
              [2., 1., 4.],
              [0., 1., 2.]])

# Rank via SVD: independent iff rank equals number of columns
_, S, Vh = np.linalg.svd(A, full_matrices=True)
rank    = int(np.sum(S > 1e-9))
nullity = A.shape[1] - rank
print('rank:', rank)        # 2  (< 3 → dependent)
print('nullity:', nullity)  # 1

# Null vector reveals the dependency: c1*v1 + c2*v2 + c3*v3 = 0
null_vec = Vh[-1]             # last row of Vh (smallest singular value)
c = null_vec / null_vec[-1]   # scale so c3 = 1
print('dependency (c1,c2,c3):', np.round(c, 4))  # [-1. -2.  1.] → v3 = v1 + 2*v2

# Coordinate representation Rep_B(v): solve B @ coords = v
B = np.array([[1., 0., 0.],
              [1., 1., 0.],
              [0., 1., 1.]])
v = np.array([2., 3., 1.])
coords = np.linalg.solve(B, v)
print('Rep_B(v):', np.round(coords, 4))
print('verify:  ', np.round(np.einsum('ij,j->i', B, coords), 4))

The einsum string 'ij,j->i' is the matrix–vector product AxA\vec{x}, identical in NumPy, PyTorch (torch.einsum), and TensorFlow (tf.einsum). The SVD-based null vector is in an orthonormal basis; c = null_vec / null_vec[-1] rescales it to the free-variable parameterization from the row-reduction.

References
Hefferon 2020 — Linear Algebra, Ch. Two §II–III: Linear Independence, Basis