Prerequisite · Matrix Algebra Foundations

Einsum in PyTorch

Colab Notebook · Python · ~45 min
Lab Objectives
1. Translate standard matrix operations (transpose, trace, inner product, outer product, matrix multiply) into torch.einsum strings and verify correctness numerically.
2. Write einsum strings for batched operations and confirm they match torch.bmm output.
3. Express the quadratic form $\mathbf{x}^\top A \mathbf{x}$ and the sample covariance $X^\top X / (n-1)$ (with $X$ mean-centered) as single einsum calls.
4. Implement the scaled dot-product attention score and weighted-sum output using einsum.
5. Construct batched 3D covariance matrices $\Sigma = R\,\text{diag}(s^2)\,R^\top$ via einsum and verify positive semi-definiteness.
6. Benchmark einsum against explicit torch ops and reason about when to prefer each.
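As a preview of objective 3, the quadratic form and the sample covariance can each be sketched as one einsum call. The sizes `n` and `d` and the random inputs below are illustrative assumptions, not values from the notebook.

```python
import torch

torch.manual_seed(0)
n, d = 100, 4                     # illustrative sizes: n samples, d features
x = torch.randn(d)
A = torch.randn(d, d)
X = torch.randn(n, d)

# Quadratic form x^T A x: i and j are both absent from the output,
# so both are summed and the result is a scalar.
quad = torch.einsum("i,ij,j->", x, A, x)
assert torch.allclose(quad, x @ A @ x, atol=1e-4)

# Sample covariance: center each column first, then contract the sample axis n.
Xc = X - X.mean(dim=0, keepdim=True)
cov = torch.einsum("ni,nj->ij", Xc, Xc) / (n - 1)

# torch.cov expects variables in rows, so pass the transpose.
assert torch.allclose(cov, torch.cov(X.T), atol=1e-4)
```

The centering step matters: without it, $X^\top X / (n-1)$ is a scaled second-moment matrix rather than a covariance.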

Lab: Einstein Summation in PyTorch

torch.einsum is a single function that can express virtually any multilinear operation on tensors (dot products, matrix multiplies, transposes, traces, outer products, and their batched variants) in a compact index notation borrowed from the Einstein summation convention of physics and mathematics.
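The operations named above can be sketched as follows. This is a minimal illustration on small random tensors (shapes chosen arbitrarily), with each einsum result checked against the corresponding explicit torch call.

```python
import torch

torch.manual_seed(0)
A = torch.randn(3, 3)
B = torch.randn(3, 4)
u = torch.randn(3)
v = torch.randn(3)

transpose = torch.einsum("ij->ji", A)      # reorder the output axes
trace = torch.einsum("ii->", A)            # repeated index on one operand: sum the diagonal
inner = torch.einsum("i,i->", u, v)        # i is absent from the output, so it is summed
outer = torch.einsum("i,j->ij", u, v)      # no shared index, so nothing is summed
matmul = torch.einsum("ik,kj->ij", A, B)   # k is contracted

assert torch.allclose(transpose, A.T)
assert torch.allclose(trace, torch.trace(A), atol=1e-6)
assert torch.allclose(inner, torch.dot(u, v), atol=1e-6)
assert torch.allclose(outer, torch.outer(u, v))
assert torch.allclose(matmul, A @ B, atol=1e-6)

# Batched variant: b appears in both inputs and in the output, so it is
# carried through as a batch axis rather than summed.
X = torch.randn(8, 3, 4)
Y = torch.randn(8, 4, 5)
batched = torch.einsum("bik,bkj->bij", X, Y)
assert torch.allclose(batched, torch.bmm(X, Y), atol=1e-6)
```

The rule of thumb in every case: an index that appears on the left of `->` but not on the right is summed over, and every index on the right becomes an output axis.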

This lab builds your fluency with the notation from the ground up, then shows how the same patterns appear in modern ML architectures.

What You'll Build

  • A reference sheet of the 5 canonical einsum families: unary, binary, batched, quadratic, and attention
  • Verified implementations of every standard matrix operation from the prereq readings, translated into einsum strings
  • A from-scratch scaled dot-product attention forward pass using only torch.einsum
  • A batch covariance routine that mirrors the 3D Gaussian Splatting (3DGS) covariance construction $\Sigma = R S S^\top R^\top$
  • A performance benchmark comparing einsum against torch.bmm and torch.matmul
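The batch covariance item might look roughly like the sketch below. It assumes $S = \text{diag}(s)$, so that $S S^\top = \text{diag}(s^2)$. The rotations are built from a QR decomposition purely for illustration (3DGS itself parameterizes rotations with quaternions), and the batch size is arbitrary.

```python
import torch

torch.manual_seed(0)
batch = 16                                    # illustrative batch size

# Stand-in rotations (an assumption of this sketch): orthogonalize random
# matrices, then flip the sign where det = -1 so every R has det = +1.
Q, _ = torch.linalg.qr(torch.randn(batch, 3, 3))
R = Q * torch.linalg.det(Q).sign()[:, None, None]

s = torch.rand(batch, 3) + 0.1                # strictly positive per-axis scales

# Sigma_ij = sum_k R_ik * s_k^2 * R_jk, computed for every batch element b.
Sigma = torch.einsum("bik,bk,bjk->bij", R, s**2, R)

# Reference using explicit ops.
ref = R @ torch.diag_embed(s**2) @ R.transpose(1, 2)
assert torch.allclose(Sigma, ref, atol=1e-5)

# Positive semi-definiteness: symmetric with non-negative eigenvalues.
assert torch.allclose(Sigma, Sigma.transpose(1, 2), atol=1e-6)
eigvals = torch.linalg.eigvalsh(Sigma)
assert eigvals.min() > -1e-6
```

Passing the scale vector `s**2` directly as a `bk` operand avoids materializing the diagonal matrix, which is one place where einsum is more direct than a matmul chain.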

Key Concepts Practiced

After this lab you will be able to read an einsum string in a research codebase and tell which axes are contracted, which are carried through as batch or broadcast axes, and what the output shape will be. You will also be able to write your own strings for novel tensor operations without resorting to explicit reshape/transpose/matmul chains.
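To make the axis-reading skill concrete, here is a minimal sketch of scaled dot-product attention written with einsum and checked against an explicit transpose/matmul chain. The dimension names and sizes (`B`, `H`, `T`, `D`) are assumptions for illustration, and the softmax between the two einsum calls is an ordinary tensor method.

```python
import math
import torch

torch.manual_seed(0)
B, H, T, D = 2, 4, 5, 8     # batch, heads, sequence length, head dim (illustrative)
q = torch.randn(B, H, T, D)
k = torch.randn(B, H, T, D)
v = torch.randn(B, H, T, D)

# "bhqd,bhkd->bhqk"
#   b, h: in both inputs and the output -> batch axes, carried through
#   d:    in both inputs, not the output -> contracted (the dot product)
#   q, k: one input each, both in output -> every query paired with every key
scores = torch.einsum("bhqd,bhkd->bhqk", q, k) / math.sqrt(D)
weights = scores.softmax(dim=-1)             # normalize over the key axis

# "bhqk,bhkd->bhqd": contract the key axis to form a weighted sum of values.
out = torch.einsum("bhqk,bhkd->bhqd", weights, v)

# The same computation as an explicit transpose/matmul chain.
ref_scores = (q @ k.transpose(-2, -1)) / math.sqrt(D)
ref_out = ref_scores.softmax(dim=-1) @ v
assert torch.allclose(out, ref_out, atol=1e-5)
```

The output shape can be read straight off the right-hand side of each string: `bhqk` gives `(B, H, T, T)` for the scores and `bhqd` gives `(B, H, T, D)` for the output.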