Gaussian Mathematics: Attributes, Covariance, and Spherical Harmonics

By the end of this reading you will be able to:
  • List the 59 parameters of a single 3D Gaussian by category (position, rotation, scale, opacity, SH coefficients) and identify which category dominates the storage budget
  • Derive the covariance factorization Σ = RSS⊤R⊤ and explain why this parameterization guarantees positive semi-definiteness throughout gradient-based optimization
  • Explain how spherical harmonics decompose view-dependent color into basis functions, identify which SH orders encode the DC color vs. view-dependent highlights, and state why the AC coefficients dominate the parameter budget
  • Compute the total uncompressed scene size in MB given a Gaussian count and float32 parameter budget, and identify which attribute categories offer the most leverage for reduction

The 59-Parameter Gaussian

Every 3D Gaussian splat carries a fixed set of attributes. Understanding what each encodes and how many parameters it contributes is essential for any compression discussion.

The full parameter budget per Gaussian:

| Attribute | Parameters | Purpose |
| --- | --- | --- |
| Position $\mu$ | 3 | Location in world space $(x, y, z)$ |
| Scale $s$ | 3 | Anisotropic extent along each axis |
| Rotation $q$ | 4 | Quaternion encoding orientation |
| Opacity $\sigma_\alpha$ | 1 | Base transparency (pre-sigmoid) |
| DC color (SH order 0) | 3 | View-independent RGB color |
| SH coefficients (orders 1–3) | 45 | View-dependent color variation |
| Total | 59 | |

At float32 precision: 59 × 4 = 236 bytes per Gaussian. A scene with 5 million Gaussians therefore takes 5,000,000 × 236 bytes ≈ 1.18 GB uncompressed (about 1.2 GB).
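The arithmetic above can be sketched in a few lines of Python. The dictionary keys and the function name `scene_size_bytes` are illustrative names chosen here, not part of any 3DGS codebase, and the sketch assumes every attribute is stored as an uncompressed float32.

```python
# Per-Gaussian attribute counts, taken from the table above.
PARAMS = {
    "position": 3,
    "scale": 3,
    "rotation": 4,
    "opacity": 1,
    "sh_dc": 3,
    "sh_rest": 45,
}
BYTES_PER_FLOAT32 = 4


def scene_size_bytes(num_gaussians: int) -> int:
    """Uncompressed scene size, assuming every attribute is a float32."""
    return num_gaussians * sum(PARAMS.values()) * BYTES_PER_FLOAT32


total_params = sum(PARAMS.values())
size = scene_size_bytes(5_000_000)
print(f"{total_params} parameters per Gaussian, {size / 1e6:.0f} MB for 5M Gaussians")

# Share of the per-Gaussian budget taken by each attribute category.
for name, count in PARAMS.items():
    print(f"{name:9s} {count:2d}  {count / total_params:6.1%}")
```

Swapping in a different Gaussian count gives the size in MB (decimal, 10^6 bytes) for any scene, and the per-category shares show where a reduction would pay off most.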

Covariance Factorization: Why Σ = RSS⊤R⊤

A covariance matrix must be positive semi-definite (PSD); otherwise it does not describe a valid Gaussian. Storing and optimizing a full 3×3 symmetric matrix is possible, but unconstrained gradient updates to its entries can easily produce non-PSD matrices at intermediate steps, breaking the Gaussian interpretation.

3DGS avoids this by factoring:

$$\Sigma = R S S^\top R^\top$$

where:

  • $R \in SO(3)$ is a rotation matrix (encoded as a quaternion $q \in \mathbb{R}^4$, $\|q\| = 1$)
  • $S = \text{diag}(s_1, s_2, s_3)$ is a diagonal scale matrix with $s_i > 0$

Why is $\Sigma$ always PSD? For any vector $v$:

$$v^\top \Sigma v = v^\top R S S^\top R^\top v = \|S^\top R^\top v\|^2 \geq 0$$

The quadratic form is a squared norm, so it is non-negative by construction. The optimization is therefore unconstrained: you can apply gradient descent to $q$ (renormalized to unit length before it is converted to $R$) and to $\log s_i$ (the log keeps each scale positive) without special projection steps.
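As a minimal sketch of this parameterization, the NumPy code below builds $\Sigma$ from an arbitrary 4-vector and an arbitrary log-scale vector, then checks that the result stays symmetric with positive eigenvalues. The function names and the `(w, x, y, z)` quaternion ordering are assumptions made for this example.

```python
import numpy as np


def quat_to_rotmat(q: np.ndarray) -> np.ndarray:
    """Rotation matrix from a quaternion (w, x, y, z); q is normalized first."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def covariance(q: np.ndarray, log_s: np.ndarray) -> np.ndarray:
    """Sigma = R S S^T R^T with S = diag(exp(log_s))."""
    R = quat_to_rotmat(q)
    S = np.diag(np.exp(log_s))  # exp keeps every scale strictly positive
    M = R @ S
    return M @ M.T


# Any unconstrained (q, log_s) pair yields a valid covariance matrix.
rng = np.random.default_rng(0)
for _ in range(1000):
    sigma = covariance(rng.normal(size=4), rng.normal(size=3))
    assert np.allclose(sigma, sigma.T)
    assert np.linalg.eigvalsh(sigma).min() > 0
```

Because the raw parameters are drawn at random with no constraints, the loop stands in for arbitrary gradient steps: whatever values the optimizer reaches, the constructed matrix remains a valid covariance.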

The geometric interpretation: $S$ stretches a unit sphere along three principal axes, and $R$ rotates the result. The Gaussian ellipsoid is the image of this transformation.
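The same picture shows where the factorization comes from. As a short derivation, assume $z$ is a standard normal sample and push it through the stretch-then-rotate map (the symbols $z$ and $x$ are introduced here only for illustration):

```latex
\begin{aligned}
x &= \mu + R S z, \qquad z \sim \mathcal{N}(0, I) \\
\operatorname{Cov}(x) &= \mathbb{E}\big[(R S z)(R S z)^\top\big]
  = R S \, \mathbb{E}[z z^\top] \, S^\top R^\top
  = R S S^\top R^\top = \Sigma \\
\Sigma &= R \, \operatorname{diag}(s_1^2, s_2^2, s_3^2) \, R^\top
\end{aligned}
```

The last line uses the fact that $S$ is diagonal, so $S S^\top = \operatorname{diag}(s_i^2)$: the columns of $R$ are the principal axes of the ellipsoid and the $s_i^2$ are the variances along them.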

Spherical Harmonics for View-Dependent Color

Real surfaces exhibit view-dependent appearance: specular highlights, reflections, refractions, and subsurface scattering all change color based on the viewing direction. A single RGB color per Gaussian is insufficient.

Spherical harmonics (SH) are an orthonormal basis of functions on the unit sphere. They decompose a directional function $f(\theta, \phi)$ into frequency bands:

$$f(\theta, \phi) = \sum_{l=0}^{L} \sum_{m=-l}^{l} c_l^m Y_l^m(\theta, \phi)$$

where $Y_l^m$ are the SH basis functions and $c_l^m$ are learnable coefficients.

The number of coefficients per channel, cumulative through each order and new in each band:

  • Order 0 (DC): $(0+1)^2 = 1$ coefficient → view-independent color
  • Order 1: $(1+1)^2 = 4$ total → 3 new coefficients (linear variation)
  • Order 2: $(2+1)^2 = 9$ total → 5 new coefficients
  • Order 3: $(3+1)^2 = 16$ total → 7 new coefficients

Since 3DGS uses order 3 and stores one SH expansion per color channel, that's $16 \times 3 = 48$ SH parameters. Subtracting the 3 DC terms (already counted under "DC color") leaves 45 SH AC coefficients.

Why SH Dominates the Bit Budget

The 45 SH coefficients represent $45/59 \approx 76\%$ of all parameters per Gaussian. This immediately suggests SH degree reduction as the highest-leverage compaction strategy, a topic we will explore in Module 3.

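The trade-off: dropping from order 3 to order 0 (DC only) removes all 45 AC coefficients, cutting the count from 59 to 14 parameters per Gaussian (about a 76% reduction), but it eliminates all view-dependent effects. Intermediate orders 1 and 2 sit between these extremes, trading view-dependent detail for size in discrete steps.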
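To make the trade-off concrete, the sketch below tabulates the per-Gaussian budget for each maximum SH order, using the $(l+1)^2$ coefficients-per-channel rule from the previous section. The helper name `params_per_gaussian` is hypothetical, and the byte counts assume float32 storage.

```python
GEOMETRY_AND_OPACITY = 3 + 3 + 4 + 1  # position, scale, rotation, opacity
CHANNELS = 3                          # one SH expansion per RGB channel


def params_per_gaussian(max_order: int) -> int:
    """Parameters per Gaussian when the SH expansion stops at max_order (0..3)."""
    sh_per_channel = (max_order + 1) ** 2
    return GEOMETRY_AND_OPACITY + CHANNELS * sh_per_channel


full = params_per_gaussian(3)
for order in range(4):
    n = params_per_gaussian(order)
    saved = 1 - n / full
    print(f"order {order}: {n:2d} params, {4 * n:3d} bytes, {saved:.0%} smaller than order 3")
```

The size steps are uneven because each higher band adds more coefficients than the one before it (3, 5, then 7 per channel), so removing the top band alone already saves 21 of the 59 parameters.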

The Gaussian Density Function in Context

Putting it together, the full density contribution of Gaussian $i$ at point $\mathbf{p}$ (an unnormalized multivariate Gaussian scaled by opacity):

$$f_i(\mathbf{p}) = \sigma_\alpha \exp\!\left(-\frac{1}{2}(\mathbf{p}-\mu_i)^\top \Sigma_i^{-1}(\mathbf{p}-\mu_i)\right)$$
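A minimal NumPy sketch of this formula follows. The function name `gaussian_weight` is hypothetical, and the `opacity` argument is assumed to be a value already in $[0, 1]$; since the attribute table lists the stored opacity as pre-sigmoid, a real pipeline would presumably apply the sigmoid before this step.

```python
import numpy as np


def gaussian_weight(p, mu, sigma, opacity):
    """opacity * exp(-0.5 * (p - mu)^T Sigma^{-1} (p - mu))."""
    d = np.asarray(p, dtype=float) - np.asarray(mu, dtype=float)
    # Solve Sigma x = d instead of forming the inverse explicitly.
    maha_sq = d @ np.linalg.solve(sigma, d)
    return opacity * np.exp(-0.5 * maha_sq)


mu = np.array([0.0, 0.0, 0.0])
sigma = np.diag([1.0, 4.0, 9.0])  # axis-aligned ellipsoid with scales 1, 2, 3

print(gaussian_weight(mu, mu, sigma, opacity=0.8))               # peak value at the center
print(gaussian_weight([0.0, 2.0, 0.0], mu, sigma, opacity=0.8))  # one scale unit along y
```

At the center the exponent is zero, so the weight equals the opacity; moving one scale length along any principal axis multiplies it by $e^{-1/2}$.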

The color seen along direction $\mathbf{d}$ is then:

$$c_i(\mathbf{d}) = \text{sigmoid}\!\left(\sum_{l,m} c_l^m Y_l^m(\mathbf{d})\right)_{\text{per channel}}$$

The sigmoid maps the linear SH sum into $[0,1]$ for each of the three RGB channels.
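The color formula can be illustrated with a sketch truncated at order 1 (4 coefficients per channel instead of 16) to keep it short. The function names, the `(4, 3)` coefficient layout, and the sign convention of the order-1 basis terms are assumptions made for this example; the sigmoid follows the formula above.

```python
import numpy as np

SH_C0 = 0.28209479177387814  # Y_0^0 = 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # sqrt(3 / (4 * pi)), magnitude of the order-1 terms


def sh_basis_order1(d):
    """Real SH basis up to order 1 for a direction d (normalized here).

    The signs of the order-1 terms are one common convention (assumed).
    """
    x, y, z = np.asarray(d, dtype=float) / np.linalg.norm(d)
    return np.array([SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x])


def color(coeffs, d):
    """RGB color for view direction d; coeffs has shape (4, 3), one column per channel."""
    raw = sh_basis_order1(d) @ np.asarray(coeffs, dtype=float)
    return 1.0 / (1.0 + np.exp(-raw))  # sigmoid, applied per channel


coeffs = np.zeros((4, 3))
coeffs[0] = [1.0, 2.0, 3.0]  # DC term only: color is the same from every direction
print(color(coeffs, [0, 0, 1]), color(coeffs, [1, 0, 0]))

coeffs[2] = [1.0, 0.0, 0.0]  # add a z-dependent term to the red channel
print(color(coeffs, [0, 0, 1]), color(coeffs, [0, 0, -1]))
```

With only the DC row set, the two printed colors match; once an order-1 coefficient is non-zero, the red channel differs between opposite view directions, which is the view dependence the AC coefficients encode.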