3D Gaussian Splatting · Foundations of 3DGS

Rendering Pipeline and Differentiable Training


By the end of this reading you will be able to:

Trace the 3DGS rendering pipeline from 3D Gaussians to final pixel colors, identifying the projection (Σ′ = JWΣW⊤J⊤), tile assignment, depth sort, and alpha compositing steps
Explain why per-tile depth sorting is an approximation, identify the conditions under which it produces visible artifacts, and state why it is used despite this limitation
Distinguish cloning from splitting in adaptive densification and identify the specific positional-gradient signal and Gaussian scale condition that triggers each operation
Explain why the photometric loss combines L1 and D-SSIM terms, identify what artifact each term is sensitive to, and state the role of opacity resets in preventing opacity fog

From 3D Gaussians to a 2D Image

The 3DGS rendering pipeline is a carefully engineered rasterizer that trades the exactness of ray marching for real-time performance. Understanding it precisely matters because training differentiates through this same pipeline: the backward pass sends gradients through every step described here.

Step 1: Projection — 3D Gaussians to 2D Splats

A 3D Gaussian with mean $\mu$ and covariance $\Sigma$ projects to a 2D Gaussian on the image plane. Given the viewing transform $W$ and the Jacobian $J$ of the affine (first-order) approximation of the perspective projection, evaluated at the Gaussian's mean:

$\Sigma' = J W \Sigma W^\top J^\top$

The 3D ellipsoid becomes a 2D ellipse ("splat") in screen space. The projected mean $\mu'$ is simply the perspective projection of $\mu$ .
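The projection above can be sketched in a few lines of plain Python. This is a minimal sketch, not the reference CUDA code: it assumes a pinhole camera with focal lengths `fx` and `fy` in pixels, takes `W` to be only the 3×3 rotation part of the world-to-camera transform, and uses the top two rows of $J$ so the result is directly the 2×2 screen-space covariance. The function and parameter names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*a)]


def project_covariance(cov3d, W, t, fx, fy):
    """Return the 2x2 screen-space covariance J W Sigma W^T J^T.

    cov3d : 3x3 world-space covariance (Sigma)
    W     : 3x3 rotation part of the world-to-camera transform
    t     : camera-space mean (x, y, z) with z > 0
    fx, fy: focal lengths in pixels (assumed pinhole intrinsics)
    """
    x, y, z = t
    # Jacobian of (x, y, z) -> (fx * x / z, fy * y / z), evaluated at t.
    J = [[fx / z, 0.0, -fx * x / (z * z)],
         [0.0, fy / z, -fy * y / (z * z)]]
    JW = matmul(J, W)                                  # 2x3
    return matmul(matmul(JW, cov3d), transpose(JW))    # 2x2


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sphere = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]  # std 0.1
cov2d = project_covariance(sphere, identity, (0.0, 0.0, 2.0), 100.0, 100.0)
```

In this example a sphere with standard deviation 0.1 sits on the optical axis at depth 2. With a focal length of 100 pixels it projects to a circular splat with standard deviation 0.1 · 100 / 2 = 5 pixels, so `cov2d` is approximately 25 on the diagonal and 0 elsewhere.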

Step 2: Tile-Based Rasterization

The screen is divided into a grid of 16×16 pixel tiles. For each Gaussian, 3DGS determines which tiles its 2D extent intersects and creates a list entry per tile.

Key steps:

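Sorting: every (tile, Gaussian) entry is sorted by depth, front to back, with a fast GPU radix sort. In Kerbl et al. 2023 this is a single sort over keys that combine tile ID and depth, so each tile ends up with its own depth-ordered list.
Alpha accumulation: tiles are processed in parallel on the GPU, with one thread block per tile and one thread per pixel, each thread accumulating color front to back.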

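The sort orders Gaussians once per tile by the depth of their centers, not per pixel along each ray. That is an approximation: when Gaussians overlap in depth, the correct order can differ between pixels of the same tile, and the order can flip as the camera moves, which shows up as popping. In practice the artifacts are usually minor, and sorting once per tile is far cheaper than sorting per pixel, which is why it is used despite the limitation.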
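The tile assignment and sort can be illustrated on the CPU. The sketch below assumes each splat is summarized by a center, a conservative pixel radius, and a center depth; Python's built-in sort stands in for the GPU radix sort, and all names are hypothetical.

```python
TILE = 16  # tile side in pixels, as in the text


def tile_keys(splats, width, height):
    """Build (tile_id, depth, splat_index) keys, sorted so that each tile's
    splats are contiguous and ordered front to back.

    splats: list of (cx, cy, radius, depth); cx, cy, radius are in pixels
            and depth is the camera-space depth of the splat center.
    """
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    keys = []
    for idx, (cx, cy, radius, depth) in enumerate(splats):
        # Conservative bounding square of the splat, clamped to the grid.
        x0 = max(0, int((cx - radius) // TILE))
        x1 = min(tiles_x - 1, int((cx + radius) // TILE))
        y0 = max(0, int((cy - radius) // TILE))
        y1 = min(tiles_y - 1, int((cy + radius) // TILE))
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                keys.append((ty * tiles_x + tx, depth, idx))
    keys.sort()  # stand-in for the GPU radix sort over (tile, depth) keys
    return keys


splats = [(8, 8, 4, 5.0), (20, 8, 10, 2.0), (8, 8, 2, 1.0)]
keys = tile_keys(splats, width=64, height=32)
tile0_order = [idx for tile, _, idx in keys if tile == 0]
```

Here the second splat overlaps four tiles and therefore produces four keys, while the other two touch only tile 0. Within tile 0 the splats come out in order of increasing center depth, which is exactly the per-tile (not per-pixel) ordering discussed above.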

Step 3: Alpha Compositing

The rendered color at pixel $\mathbf{p}$ is the alpha-composited sum over all $N$ depth-sorted Gaussians visible in that pixel:

$c(\mathbf{p}) = \sum_{i=1}^{N} c_i\, \alpha_i'(\mathbf{p}) \prod_{j=1}^{i-1} \bigl(1 - \alpha_j'(\mathbf{p})\bigr)$

where:

$c_i$ is the view-dependent color of Gaussian $i$ (evaluated via SH)
$\alpha_i'(\mathbf{p}) = \sigma_{\alpha,i} \cdot f_i'(\mathbf{p})$ is the effective alpha at pixel $\mathbf{p}$ , combining opacity and the 2D Gaussian density
$\prod_{j<i}(1-\alpha_j')$ is the accumulated transmittance — how much light reaches Gaussian $i$ after the Gaussians in front absorb/occlude it

Early termination: once the accumulated transmittance drops below a threshold (typically 0.0001), all remaining Gaussians are skipped. This is what makes deep scenes tractable: most rays terminate well before reaching all Gaussians.
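For a single pixel, the compositing sum and the early-termination rule can be written as a short loop. This is a sketch under the assumption that the Gaussians are already depth-sorted and their effective alphas $\alpha_i'(\mathbf{p})$ have been evaluated; the function name and the `t_min` parameter are hypothetical.

```python
def composite(colors, alphas, t_min=1e-4):
    """Front-to-back alpha compositing for one pixel.

    colors: list of (r, g, b), one per Gaussian, sorted front to back
    alphas: effective alpha of each Gaussian at this pixel, in [0, 1)
    t_min : transmittance threshold for early termination
    Returns (pixel_color, number_of_gaussians_blended).
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # product of (1 - alpha_j) over Gaussians in front
    used = 0
    for color, alpha in zip(colors, alphas):
        if transmittance < t_min:
            break  # everything behind is effectively occluded
        weight = alpha * transmittance
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
        used += 1
    return pixel, used


red, green, blue = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
half, n_half = composite([red, green], [0.5, 0.5])
opaque, n_opaque = composite([red, green, blue], [0.999, 0.999, 0.5])
```

In the first call the red Gaussian contributes 0.5 and the green one 0.5 · (1 − 0.5) = 0.25. In the second call the transmittance falls to about $10^{-6}$ after two nearly opaque Gaussians, so the third is skipped.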

Training: Differentiable Rendering + Gradient Descent

All operations above are differentiable with respect to the Gaussian parameters. Training proceeds in three phases:

1. Initialize Gaussian positions from the SfM sparse point cloud.

2. Iterate — at each step, four operations form a tight inner loop:

Figure: the training loop. ① Render → ② Loss → ③ Backprop → ④ Update, repeated for each training iteration.

Render: rasterize the current Gaussians from a randomly sampled training viewpoint using the tile rasterizer.
Compute loss: the photometric objective combines a per-pixel absolute error term with a structural similarity term:

$\mathcal{L} = (1-\lambda)\mathcal{L}_1 + \lambda\mathcal{L}_{D\text{-}SSIM}$

$\mathcal{L}_1$ penalizes the mean absolute color error over pixels. $\mathcal{L}_{D\text{-}SSIM}$ penalizes structural and contrast differences that $\mathcal{L}_1$ alone is insensitive to: blurring and edge misalignment both raise it even when the average color is correct. Typical $\lambda = 0.2$.

Backpropagate: differentiate through the rasterizer to obtain $\partial\mathcal{L}/\partial\mu$, $\partial\mathcal{L}/\partial\Sigma$, $\partial\mathcal{L}/\partial\sigma_\alpha$, and $\partial\mathcal{L}/\partial\mathbf{f}_{SH}$ for every visible Gaussian.
Update: take an Adam step on all Gaussian parameters: $\mu, \Sigma, \sigma_\alpha, \text{SH}$. In Kerbl et al. 2023, $\Sigma$ is not optimized directly; it is built from a scale vector and a rotation quaternion so that it stays a valid (positive semi-definite) covariance.
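To see what each loss term reacts to, the sketch below evaluates the objective on tiny grayscale images stored as flat lists. It makes two simplifying assumptions that should not be read as the reference implementation: SSIM is computed over a single window spanning the whole image rather than over local windows, and the D-SSIM term is taken as 1 − SSIM. The function names are hypothetical.

```python
def l1_loss(a, b):
    """Mean absolute error between two images given as flat lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def global_ssim(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over one window covering the whole image (intensities in [0, 1])."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


def photometric_loss(rendered, target, lam=0.2):
    """(1 - lam) * L1 + lam * D-SSIM, with D-SSIM taken as 1 - SSIM here."""
    d_ssim = 1.0 - global_ssim(rendered, target)
    return (1.0 - lam) * l1_loss(rendered, target) + lam * d_ssim


target = [0.0, 1.0, 0.0, 1.0]     # sharp alternating pattern
blurred = [0.5, 0.5, 0.5, 0.5]    # same mean intensity, no structure
loss_same = photometric_loss(target, target)
loss_blur = photometric_loss(blurred, target)
d_ssim_blur = 1.0 - global_ssim(blurred, target)
```

The blurred image has the same mean intensity as the target, yet its structural term is close to its maximum because all contrast has been lost. A perfect render gives a loss of zero for both terms.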

3. Every fixed number of iterations (100 in Kerbl et al. 2023): run adaptive densification (described below).

Adaptive Densification

Gradient descent alone changes Gaussian attributes but not their count. Densification periodically adjusts the population:

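What is a high positional gradient? During training, the photometric loss $\mathcal{L}$ compares each rendered image against the corresponding training image. The gradient $\partial\mathcal{L}/\partial\mu$ measures how sensitive the loss is to a Gaussian's position. A large magnitude signals that the Gaussian is in the wrong place: the scene has structure nearby that it is failing to cover. Kerbl et al. 2023 densify Gaussians whose view-space positional gradient has an average magnitude above a threshold $\tau_{pos}$, set to 0.0002 in the paper.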

Positional Gradient

The amber region shows unrecovered photometric signal. The arrow is ∂L/∂μ — pointing from the Gaussian toward the region it should cover. When this magnitude stays high across iterations, gradient descent alone cannot fix it: the Gaussian needs to be cloned or split.

Cloning

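When a small Gaussian has high positional gradient (the loss strongly wants it to move), it is under-reconstructing the region. "Small" here means that the Gaussian's scale is below a size threshold; in the reference implementation this threshold is a fixed fraction of the scene extent, and Gaussians above it count as large.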

Solution

Duplicate the Gaussian: the copy keeps the same size and is moved along the direction of the positional gradient, so that the two Gaussians together cover the under-reconstructed region.

Splitting

When a large Gaussian has high positional gradient, it is over-reconstructing — one blob covers a region that needs finer detail.

Solution

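Replace it with two smaller Gaussians. In Kerbl et al. 2023 their scale is the original scale divided by a factor of 1.6, and their positions are sampled using the original Gaussian as a probability density, so each covers part of the over-reconstructed region at higher fidelity.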
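The clone and split rules can be combined into one decision function. The sketch below follows the description in Kerbl et al. 2023 (clone small Gaussians and move the copy along the gradient direction; split large ones into two children sampled from the original, with scale divided by 1.6). It is heavily simplified: rotation is ignored, so each Gaussian is treated as axis-aligned, and `size_limit`, `step`, and `move_dir` are hypothetical inputs chosen for illustration.

```python
import random

GRAD_THRESHOLD = 0.0002  # tau_pos reported in Kerbl et al. 2023
SPLIT_FACTOR = 1.6       # scale divisor reported in Kerbl et al. 2023


def densify(gaussian, avg_grad, move_dir, size_limit, step=0.01, rng=random):
    """Return the Gaussians that replace `gaussian` after one densification check.

    gaussian  : dict with 'mu' [x, y, z] and 'scale' [sx, sy, sz]
                (axis-aligned; rotation is ignored in this sketch)
    avg_grad  : average positional-gradient magnitude over recent iterations
    move_dir  : unit vector along which a cloned copy is moved (hypothetical)
    size_limit: scale above which a Gaussian counts as large (hypothetical)
    """
    if avg_grad <= GRAD_THRESHOLD:
        return [gaussian]  # well placed: leave unchanged
    if max(gaussian["scale"]) <= size_limit:
        # Small Gaussian: clone it and nudge the copy along move_dir.
        moved = [m + step * d for m, d in zip(gaussian["mu"], move_dir)]
        return [gaussian, {"mu": moved, "scale": list(gaussian["scale"])}]
    # Large Gaussian: replace it with two smaller children whose centers
    # are sampled from the original Gaussian.
    small = [s / SPLIT_FACTOR for s in gaussian["scale"]]
    children = []
    for _ in range(2):
        mu = [rng.gauss(m, s) for m, s in zip(gaussian["mu"], gaussian["scale"])]
        children.append({"mu": mu, "scale": list(small)})
    return children


g = {"mu": [0.0, 0.0, 0.0], "scale": [0.1, 0.1, 0.1]}
kept = densify(g, 0.0001, [1.0, 0.0, 0.0], size_limit=1.0)
cloned = densify(g, 0.001, [1.0, 0.0, 0.0], size_limit=1.0)
split = densify(g, 0.001, [1.0, 0.0, 0.0], size_limit=0.05, rng=random.Random(0))
```

The same Gaussian is left alone, cloned, or split depending only on its averaged gradient and on how its scale compares with `size_limit`. Cloning increases the count by one and keeps the original; splitting also increases the count by one but removes the original.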

Pruning

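Gaussians with opacity $\sigma_\alpha$ below a threshold are nearly invisible and waste parameters, so they are deleted. Periodically, all opacities are also reset to near-zero and re-learned. Gaussians that the scene needs regain their opacity through further optimization, while the rest stay below the threshold and are pruned. This prevents "opacity fog", where Gaussians persist with low but non-zero opacity.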
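Pruning and the opacity reset are simple to express once each Gaussian carries an opacity value. The sketch below is an illustration only: the threshold `min_opacity` and the reset value `ceiling` are assumed numbers, and the "re-learning" between reset and prune is simulated by hand rather than by an optimizer.

```python
def prune(gaussians, min_opacity=0.005):
    """Drop Gaussians whose opacity is below the threshold (assumed value)."""
    return [g for g in gaussians if g["opacity"] >= min_opacity]


def reset_opacity(gaussians, ceiling=0.01):
    """Clamp every opacity down to a small value so it must be re-learned."""
    for g in gaussians:
        g["opacity"] = min(g["opacity"], ceiling)
    return gaussians


scene = [{"opacity": 0.9}, {"opacity": 0.001}, {"opacity": 0.3}]
scene = prune(scene)           # removes the nearly invisible Gaussian
scene = reset_opacity(scene)   # both survivors now sit at the ceiling
after_reset = [g["opacity"] for g in scene]

# Simulated re-learning: one Gaussian is needed and regains opacity,
# the other contributes nothing useful and decays further.
scene[0]["opacity"] = 0.8
scene[1]["opacity"] = 0.002
scene = prune(scene)
```

After the reset every Gaussian has to earn its opacity back. The one that decays instead of recovering is removed at the next pruning step, which is how low-opacity fog gets cleared out.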

Why Densification Matters for Compression

The final Gaussian count after training is not fixed — it emerges from the densification history. Methods that improve densification (Module 3) directly reduce the total parameter count before any encoding, which is often more impactful than encoding improvements applied to a fixed representation.

References

Kerbl et al. 2023 — 3D Gaussian Splatting for Real-Time Radiance Field Rendering
