The .splat Binary Format: Opacity Sorting, SH Coefficients & Byte Layout
- Decode the 32-byte .splat format layout — position (12 B), scale (12 B), color+alpha (4 B), rotation (4 B) — and identify how each attribute is quantized from float32
- Explain the normalization applied to the DC spherical harmonic coefficient for .splat uint8 color storage, including the SH₀ constant factor and the sigmoid-to-uint8 opacity encoding
- Explain the opacity-weighted volume sort heuristic V_k·α_k and state why it produces a view-independent Gaussian ordering suitable for real-time rendering without per-frame sorting
- Implement a field-aware PLY-to-.splat converter that detects whether a PLY file is a standard 3DGS PLY or a simple point cloud and dispatches to the correct conversion path
Two Formats, One Pipeline
A completed 3DGS training run produces a .ply file — a standard polygon mesh container storing every Gaussian's full attribute set. A web viewer such as antimatter15's splat.js consumes a .splat file — a compact binary format designed for streaming and real-time rendering in the browser. Converting between them requires understanding both formats precisely, and the conversion encodes several non-trivial mathematical decisions.
The 3DGS .ply Attribute Schema
A 3DGS .ply file stores one vertex per Gaussian with the following fields (62 float32 values in total):
| Field group | Field names | Count | Semantics |
|---|---|---|---|
| Position | x, y, z | 3 | World-space mean |
| Normals | nx, ny, nz | 3 | Unused in 3DGS; always zero |
| DC SH | f_dc_0, f_dc_1, f_dc_2 | 3 | Zeroth-order spherical harmonic coefficients for RGB |
| AC SH | f_rest_0, …, f_rest_44 | 45 | Higher-order view-dependent SH coefficients |
| Opacity | opacity | 1 | Pre-sigmoid opacity (raw logit) |
| Scale | scale_0, scale_1, scale_2 | 3 | Log-scale axes |
| Rotation | rot_0, rot_1, rot_2, rot_3 | 4 | Rotation quaternion (raw, before normalization) |
Total: 3+3+3+45+1+3+4 = 62 fields at float32 = 248 bytes per Gaussian.
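The field count and record size can be checked mechanically. The sketch below builds a NumPy structured dtype that mirrors the schema table; the field ordering is an assumption taken from the table, and a given exporter may order its properties differently.

```python
import numpy as np

# Field names follow the schema table above. The ordering is an assumption
# for illustration; a real file's header is the authority on property order.
names = (
    ["x", "y", "z", "nx", "ny", "nz"]
    + [f"f_dc_{i}" for i in range(3)]
    + [f"f_rest_{i}" for i in range(45)]
    + ["opacity"]
    + [f"scale_{i}" for i in range(3)]
    + [f"rot_{i}" for i in range(4)]
)

# Every property is a little-endian float32, so the record is 4 bytes per field.
gaussian_dtype = np.dtype([(name, "<f4") for name in names])

num_fields = len(gaussian_dtype.names)
bytes_per_gaussian = gaussian_dtype.itemsize
print(num_fields, bytes_per_gaussian)
```

Comparing `gaussian_dtype.names` against the names reported by a loaded file is one quick way to spot a schema mismatch before conversion.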
A simple point cloud .ply (e.g., from COLMAP or a LiDAR scanner) has only x, y, z, red, green, blue. When you attempt to access scale_0 on such a file, you get:
ValueError: no field of name scale_0
The correct diagnostic is to inspect the available fields before assuming a format:
from plyfile import PlyData

plydata = PlyData.read(path)  # path: location of the .ply file
print(plydata['vertex'].data.dtype.names)
# ('x', 'y', 'z', 'red', 'green', 'blue') ← simple cloud
# ('x', 'y', 'z', 'nx', 'ny', 'nz', 'f_dc_0', ..., 'rot_3') ← 3DGS
The .splat Binary Format: 32 Bytes per Gaussian
The .splat format strips all data the browser renderer doesn't need and reencodes the remainder in a compact binary layout. Each Gaussian occupies exactly 32 bytes:
Bytes 0–11 │ Position │ 3 × float32 │ x, y, z (raw world-space)
Bytes 12–23 │ Scale │ 3 × float32 │ exp(scale_0), exp(scale_1), exp(scale_2)
Bytes 24–27 │ Color + α │ 4 × uint8 │ R, G, B (SH₀), α (sigmoid)
Bytes 28–31 │ Rotation │ 4 × uint8 │ q_w, q_x, q_y, q_z (normalized, encoded)
Total: 12 + 12 + 4 + 4 = 32 bytes, versus 248 bytes in the .ply: a 7.75× size reduction before any compression. The viewer applies no activation functions when it reads a record. Scale is stored already exponentiated, so it is used as-is. Opacity is stored after the sigmoid has been applied, so the viewer only divides by 255. Rotation is recovered by reversing the uint8 encoding.
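One way to make the layout concrete is to describe it as a packed NumPy structured dtype. This is a minimal sketch: the field names (`pos`, `scale`, `rgba`, `rot`) and the helper `read_splats` are illustrative choices, not part of the format or of any viewer's API.

```python
import numpy as np

# Packed 32-byte record matching the byte layout above.
# Field names are illustrative; the format itself only defines byte offsets.
SPLAT_DTYPE = np.dtype([
    ("pos", "<f4", (3,)),    # bytes 0-11: x, y, z
    ("scale", "<f4", (3,)),  # bytes 12-23: activated (exp) scales
    ("rgba", "u1", (4,)),    # bytes 24-27: R, G, B, alpha
    ("rot", "u1", (4,)),     # bytes 28-31: encoded quaternion
])

def read_splats(buffer: bytes) -> np.ndarray:
    """View a raw .splat byte buffer as an array of 32-byte records."""
    if len(buffer) % SPLAT_DTYPE.itemsize != 0:
        raise ValueError("buffer length is not a multiple of 32 bytes")
    return np.frombuffer(buffer, dtype=SPLAT_DTYPE)

# Round-trip one synthetic Gaussian through the binary layout.
record = np.zeros(1, dtype=SPLAT_DTYPE)
record["pos"] = [1.0, 2.0, 3.0]
record["scale"] = [0.1, 0.2, 0.3]
record["rgba"] = [255, 128, 0, 200]
record["rot"] = [255, 128, 128, 128]  # encoded identity quaternion (w=1, x=y=z=0)
decoded = read_splats(record.tobytes())
```

Because the dtype is unaligned by default, NumPy inserts no padding and `SPLAT_DTYPE.itemsize` is exactly 32.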
The Zeroth-Order Spherical Harmonic Coefficient
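The DC color fields f_dc_0, f_dc_1, f_dc_2 are the coefficients of the zeroth-order real spherical harmonic, the constant function on the sphere:
$$Y_0^0(\mathbf{d}) = \frac{1}{2\sqrt{\pi}} \approx 0.28209$$
This is derived from the normalization condition $\int_{S^2} |Y_0^0|^2 \, d\Omega = 1$: a constant $c$ integrated over the sphere's $4\pi$ steradians gives $4\pi c^2 = 1$, so $c = 1/(2\sqrt{\pi})$.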
The full spherical harmonic color model in 3DGS is:
$$\mathbf{c}(\mathbf{d}) = \tfrac{1}{2} + \sum_{\ell=0}^{L} \sum_{m=-\ell}^{\ell} \mathbf{c}_{\ell m}\, Y_\ell^m(\mathbf{d})$$
where $L = 3$ for the standard 3DGS SH degree. There are $(L+1)^2 = 16$ real SH basis functions for $\ell \le 3$: one DC term plus 15 higher-order terms per color channel, which accounts for the 45 f_rest fields.
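The DC contribution, the view-independent base color, is:
$$\mathbf{c}_{\mathrm{DC}} = \tfrac{1}{2} + C_0\, \mathbf{f}_{\mathrm{dc}}, \qquad C_0 = Y_0^0 = \frac{1}{2\sqrt{\pi}} \approx 0.28209$$
The .splat viewer uses only this DC term (no view-dependent AC), so the color encoding is:
$$R = \mathrm{clip}\big(255\,(0.5 + C_0\, f_{\mathrm{dc},0}),\ 0,\ 255\big)$$
and likewise for G and B using f_dc_1 and f_dc_2. The 0.5 offset ensures that f_dc = 0 maps to R ≈ 128 (neutral gray), not black. The scale $C_0$ ensures that $f_{\mathrm{dc}} \approx \pm 1.77$ (that is, $\pm 0.5 / C_0 = \pm\sqrt{\pi}$) drives the channel to 0 or 255, which covers the realistic range of trained 3DGS coefficients.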
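The color rule can be sketched in a few lines of NumPy. The function name `encode_dc_color` is hypothetical, and rounding to the nearest integer is an assumption; a converter that truncates on the uint8 cast would produce 127 rather than 128 for a zero coefficient.

```python
import numpy as np

SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi)), the zeroth-order SH constant

def encode_dc_color(f_dc) -> np.ndarray:
    """Map DC SH coefficients of shape (..., 3) to uint8 RGB via 0.5 + C0 * f_dc."""
    rgb = 0.5 + SH_C0 * np.asarray(f_dc, dtype=np.float64)
    # Round-to-nearest is an assumption of this sketch; plain truncation is
    # also plausible in a real converter and shifts results by at most 1.
    return np.clip(np.rint(rgb * 255.0), 0, 255).astype(np.uint8)

colors = encode_dc_color([
    [0.0, 1.7725, -1.7725],  # neutral gray, saturated high, saturated low
    [5.0, -5.0, 0.5],        # out-of-range values are clipped
])
print(colors)
```

Clipping before the cast matters: casting an out-of-range float straight to uint8 would wrap or be undefined rather than saturate.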
Opacity Encoding
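The opacity field stores a pre-sigmoid logit $o$. The activated opacity (probability of being opaque) is:
$$\alpha = \sigma(o) = \frac{1}{1 + e^{-o}}$$
In the .splat format, this is stored as a uint8:
$$\alpha_{\mathrm{u8}} = \mathrm{clip}\big(255\,\sigma(o),\ 0,\ 255\big)$$
The browser decodes $\alpha = \alpha_{\mathrm{u8}} / 255$; no sigmoid is needed at render time.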
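A small round trip illustrates the encoding and the decode the browser performs. The helper names are hypothetical, and rounding to nearest (rather than truncating) is an assumption of this sketch.

```python
import numpy as np

def encode_opacity(logit) -> np.ndarray:
    """Apply the sigmoid to raw opacity logits and quantize to uint8."""
    alpha = 1.0 / (1.0 + np.exp(-np.asarray(logit, dtype=np.float64)))
    # Round-to-nearest is an assumption; a truncating cast differs by at most 1.
    return np.clip(np.rint(alpha * 255.0), 0, 255).astype(np.uint8)

def decode_opacity(alpha_u8) -> np.ndarray:
    """Recover the activated opacity in [0, 1]; no sigmoid is involved."""
    return np.asarray(alpha_u8, dtype=np.float32) / 255.0

encoded = encode_opacity([-10.0, 0.0, 2.0, 10.0])
decoded = decode_opacity(encoded)
print(encoded, decoded)
```

The worst-case quantization error of the decoded opacity is half a step, about 0.002, when rounding to nearest.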
Scale Encoding
The scale_k fields store log-scale values: the log of each Gaussian's half-axis length. In the .splat format:
$$s_k = \exp(\mathrm{scale}_k), \qquad k \in \{0, 1, 2\}$$
This stores the activated (positive) scale directly as a float32. The Gaussian's 3D covariance is reconstructed in the browser as:
$$\Sigma = R\, S\, S^{\top} R^{\top}$$
where $R$ is the rotation matrix from the quaternion and $S = \mathrm{diag}(s_0, s_1, s_2)$.
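The covariance reconstruction can be written out directly. This is a sketch in Python rather than the viewer's own code; the function names are hypothetical, and the quaternion is assumed to be ordered (w, x, y, z), matching the byte layout described earlier.

```python
import numpy as np

def quat_to_rotmat(q) -> np.ndarray:
    """Rotation matrix from a quaternion (w, x, y, z); normalizes first."""
    w, x, y, z = np.asarray(q, dtype=np.float64) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def covariance(log_scale, q) -> np.ndarray:
    """Sigma = R S S^T R^T with S = diag(exp(log_scale))."""
    S = np.diag(np.exp(np.asarray(log_scale, dtype=np.float64)))
    M = quat_to_rotmat(q) @ S
    return M @ M.T

log_scale = np.log([1.0, 2.0, 3.0])
cov_identity = covariance(log_scale, [1.0, 0.0, 0.0, 0.0])
# 90-degree rotation about z swaps the x and y variances.
half = np.sqrt(0.5)
cov_rotated = covariance(log_scale, [half, 0.0, 0.0, half])
```

Forming `M = R @ S` and returning `M @ M.T` guarantees a symmetric positive semi-definite result, which is the reason 3DGS parameterizes covariance through scale and rotation rather than storing it directly.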
Quaternion Encoding
The rotation quaternion is stored in the .ply as raw floats (not guaranteed to be unit norm due to training dynamics). The .splat encoder normalizes and then maps to uint8:
rot_normalized = q / np.linalg.norm(q)  # unit quaternion
# Map [-1, 1] to [0, 256], then clip so that +1.0 saturates at 255
rot_bytes = (rot_normalized * 128 + 128).clip(0, 255).astype(np.uint8)
Decoding inverts this: $q_i = (b_i - 128) / 128$ for each stored byte $b_i$, followed by renormalization. The quantization error is at most $1/128 \approx 0.0078$ per component, small enough that orientation artifacts are imperceptible for typical Gaussians.
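The per-component error bound can be checked empirically. The sketch below uses hypothetical helper names and random test quaternions; it encodes with the same `* 128 + 128` mapping, decodes, and measures the worst error before renormalization.

```python
import numpy as np

def encode_quat(q) -> np.ndarray:
    """Normalize quaternions along the last axis and map [-1, 1] to uint8."""
    q = np.asarray(q, dtype=np.float64)
    q = q / np.linalg.norm(q, axis=-1, keepdims=True)
    return (q * 128 + 128).clip(0, 255).astype(np.uint8)

def decode_quat(b) -> np.ndarray:
    """Invert the uint8 mapping, then renormalize to unit length."""
    q = (np.asarray(b, dtype=np.float64) - 128.0) / 128.0
    return q / np.linalg.norm(q, axis=-1, keepdims=True)

# Random, unnormalized quaternions stand in for raw training output.
rng = np.random.default_rng(0)
raw = rng.normal(size=(1000, 4))
unit = raw / np.linalg.norm(raw, axis=1, keepdims=True)

encoded = encode_quat(raw)
pre_renorm = (encoded.astype(np.float64) - 128.0) / 128.0
max_component_error = np.abs(pre_renorm - unit).max()
decoded_norms = np.linalg.norm(decode_quat(encoded), axis=1)
identity_bytes = encode_quat([1.0, 0.0, 0.0, 0.0])
```

The cast truncates rather than rounds, so the decoded components are biased slightly toward negative values; the final renormalization does not remove that bias but keeps the quaternion valid.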
The Opacity-Weighted Volume Sort
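The .splat format requires pre-sorted Gaussians. Web viewer alpha compositing traverses the array in order, blending each Gaussian's color with the accumulated transmittance:
$$C = \sum_{i} \mathbf{c}_i\, \alpha_i \prod_{j < i} (1 - \alpha_j)$$
where $\alpha_i$ is the effective opacity of Gaussian $i$ at the pixel and the product is the transmittance remaining after the Gaussians composited before it.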
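For depth-correct rendering, Gaussians should be sorted front-to-back or back-to-front relative to the viewer. However, a static .splat file has no per-frame depth information. Instead, the converter uses a view-independent importance heuristic:
$$\mathrm{key}_k = V_k \cdot \alpha_k$$
where:
- $V_k = \exp(\mathrm{scale}_0 + \mathrm{scale}_1 + \mathrm{scale}_2) = s_0 s_1 s_2$ is proportional to the volume of the Gaussian ellipsoid (the sum of the log-scales in the exponent is its log-volume)
- $\alpha_k = \sigma(o_k)$ is the activated opacity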
Gaussians are sorted descending by this key — large, opaque Gaussians first. The rationale: large opaque Gaussians dominate the visual contribution regardless of viewpoint. Rendering them early saturates the transmittance quickly, correctly obscuring smaller Gaussians behind them. Small or nearly transparent Gaussians contribute little regardless of order.
This is an approximation: it can produce artifacts at oblique views where a small-but-foreground Gaussian is obscured by a large-but-background one. Depth-sorted rendering (computed per frame on the GPU) would be exact but is too expensive for real-time browser use. The sort heuristic works well in practice for scenes with normal viewing distance distributions.
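The importance sort can be sketched as follows. The function name and the four sample Gaussians are made up for illustration; the key is volume times activated opacity, sorted in descending order.

```python
import numpy as np

def importance_order(log_scales, opacity_logits) -> np.ndarray:
    """Return indices that sort Gaussians by V * alpha, largest first."""
    log_scales = np.asarray(log_scales, dtype=np.float64)
    logits = np.asarray(opacity_logits, dtype=np.float64)
    volume = np.exp(log_scales.sum(axis=1))  # s0 * s1 * s2
    alpha = 1.0 / (1.0 + np.exp(-logits))    # sigmoid of the raw logit
    key = volume * alpha
    # Negate for descending order; a stable sort keeps ties in input order.
    return np.argsort(-key, kind="stable")

# Four illustrative Gaussians: medium/half-opaque, large/opaque,
# tiny/opaque, large/nearly transparent.
log_scales = np.log([
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0],
    [0.1, 0.1, 0.1],
    [2.0, 2.0, 2.0],
])
logits = np.array([0.0, 4.0, 4.0, -4.0])
order = importance_order(log_scales, logits)
print(order)
```

In this sample the large, nearly transparent Gaussian ranks below the medium half-opaque one, which shows that opacity can outweigh an eightfold volume difference.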
A Robust Field-Aware Converter
Production converters must handle both simple and 3DGS PLY files:
def detect_ply_type(plydata):
    """Classify a PLY by its vertex field names: '3dgs' or 'simple'."""
    fields = set(plydata['vertex'].data.dtype.names)
    required_3dgs = {'scale_0', 'scale_1', 'scale_2', 'opacity',
                     'f_dc_0', 'f_dc_1', 'f_dc_2',
                     'rot_0', 'rot_1', 'rot_2', 'rot_3'}
    if required_3dgs.issubset(fields):
        return '3dgs'
    elif {'red', 'green', 'blue'}.issubset(fields):
        return 'simple'
    else:
        raise ValueError(f"Unrecognized PLY schema: {sorted(fields)}")

def convert(plydata, ply_type):
    """Dispatch to the encoder that matches the detected PLY type."""
    if ply_type == '3dgs':
        return process_ply_to_splat(plydata)         # full encoder
    elif ply_type == 'simple':
        return process_simple_ply_to_splat(plydata)  # fixed-scale fallback
    # Fail explicitly instead of silently returning None
    raise ValueError(f"Unknown PLY type: {ply_type!r}")
The simple fallback assigns a fixed scale and identity rotation, representing each point as a small spherical Gaussian, which is geometrically approximate but sufficient for point-cloud preview. This pattern (inspect the schema, dispatch to a type-specific handler, fail explicitly on an unknown schema) will recur in Module 8 as a foundational enterprise application pattern.
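A fallback of this kind might look like the sketch below. It is not the body of process_simple_ply_to_splat, which this section does not show. The function `simple_cloud_to_splat`, its array-based signature, the default `point_scale` of 0.01, and the fully opaque alpha are all assumptions made for illustration.

```python
import numpy as np

# Same 32-byte record as the .splat layout; field names are illustrative.
SPLAT_DTYPE = np.dtype([
    ("pos", "<f4", (3,)),
    ("scale", "<f4", (3,)),
    ("rgba", "u1", (4,)),
    ("rot", "u1", (4,)),
])

def simple_cloud_to_splat(xyz, rgb, point_scale=0.01) -> bytes:
    """Encode an (N, 3) point cloud with uint8 colors as isotropic splats.

    point_scale is a hypothetical default; pick it relative to scene units.
    """
    xyz = np.asarray(xyz, dtype=np.float32).reshape(-1, 3)
    rgb = np.asarray(rgb, dtype=np.uint8).reshape(-1, 3)
    if len(xyz) != len(rgb):
        raise ValueError("xyz and rgb must have the same number of points")
    out = np.zeros(len(xyz), dtype=SPLAT_DTYPE)
    out["pos"] = xyz
    out["scale"] = point_scale            # same scale on all three axes
    out["rgba"][:, :3] = rgb              # colors are already uint8
    out["rgba"][:, 3] = 255               # assumption: fully opaque points
    out["rot"] = [255, 128, 128, 128]     # encoded identity quaternion
    return out.tobytes()

blob = simple_cloud_to_splat(
    xyz=[[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]],
    rgb=[[255, 0, 0], [0, 255, 0]],
)
records = np.frombuffer(blob, dtype=SPLAT_DTYPE)
```

A caller working from a loaded PLY would first stack the x, y, z and red, green, blue vertex fields into the two input arrays.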