SVD, PCA, and the Geometry of Neural Representations
sklearn.decomposition.PCA
torch.linalg.svd for decomposition; in TensorFlow, use tf.linalg.svd and compare outputs and timing (torch.linalg.svd returns U, S, Vh, whereas tf.linalg.svd returns s, u, v with v not transposed)
Lab Overview
The singular value decomposition (SVD) is arguably the most important computational tool in applied linear algebra. It underlies PCA, matrix factorisation, and dimensionality reduction, and it features in the emerging theory of why large language models generalise. In this capstone lab you will compute SVDs on real data, perform low-rank approximation, verify the SVD–PCA equivalence, and analyse the spectral structure of a pre-trained embedding matrix.
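As a warm-up for the low-rank approximation step, here is a minimal sketch using NumPy on synthetic data. The helper name low_rank_approx and the random rank-5 matrix are illustrative assumptions, not part of the lab materials. The sketch truncates the SVD to rank k and checks the Eckart-Young identity: the Frobenius error of the truncation equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

def low_rank_approx(A, k):
    """Rank-k truncation of A via SVD; also returns the full spectrum.

    Hypothetical helper for illustration only.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return A_k, s

rng = np.random.default_rng(0)
# Synthetic stand-in for real data: a rank-5 signal plus small noise.
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
A += 0.01 * rng.standard_normal(A.shape)

for k in (1, 5, 10):
    A_k, s = low_rank_approx(A, k)
    err = np.linalg.norm(A - A_k, "fro")
    tail = np.sqrt(np.sum(s[k:] ** 2))  # energy in the discarded singular values
    print(f"rank {k:2d}: Frobenius error {err:.4f}, spectral tail {tail:.4f}")
```

The two printed columns should agree at every rank, and the error should drop sharply once k reaches the rank of the underlying signal.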
What You Will Build
A notebook that (1) reconstructs GloVe embeddings at varying ranks and measures the Frobenius-norm error, (2) implements PCA from scratch via SVD and verifies that it matches sklearn.decomposition.PCA, (3) plots the singular value spectrum and identifies the effective rank, and (4) probes implicit low-rank structure in a pre-trained embedding matrix, connecting the theory of gradient descent to empirical observations in modern LLMs.
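Steps (2) and (3) can be sketched along the following lines. This is a rough outline under stated assumptions, not the lab's reference solution: the function names pca_via_svd and effective_rank are hypothetical, the data is synthetic rather than GloVe, and "effective rank" is taken here to mean the smallest k that captures 95% of the squared singular-value mass, which is only one of several common definitions. To keep the example dependency-free, the check is done against the eigenvalues of the sample covariance matrix rather than against sklearn.

```python
import numpy as np

def pca_via_svd(X, n_components):
    """PCA from the SVD of the centred data matrix (hypothetical helper)."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # principal axes, one per row
    # Variance along each axis, using the unbiased (n - 1) normalisation.
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    scores = Xc @ components.T                   # data projected onto the axes
    return components, explained_variance, scores

def effective_rank(s, energy=0.95):
    """Smallest k whose leading singular values hold `energy` of the
    squared spectrum. The 0.95 threshold is an assumed convention."""
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))

rng = np.random.default_rng(0)
# Synthetic stand-in for an embedding matrix: rank-4 signal plus noise.
X = rng.standard_normal((300, 4)) @ rng.standard_normal((4, 20))
X += 0.05 * rng.standard_normal(X.shape)

components, var, scores = pca_via_svd(X, 3)

# SVD-PCA equivalence: the explained variances should equal the largest
# eigenvalues of the sample covariance matrix.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
print("variances match covariance eigenvalues:", np.allclose(var, eigvals[:3]))

s_full = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
print("effective rank at 95% energy:", effective_rank(s_full))
```

When comparing against sklearn.decomposition.PCA in the notebook, bear in mind that principal axes are only defined up to sign, so components and scores should be compared up to a per-component sign flip, while explained variances should agree directly.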