3D Gaussian Splatting · End-to-End Scene Reconstruction

Full Pipeline: YouTube Video → GLOMAP → 3DGS → .splat

Google Colab Notebook · Python · ~120 min
Lab Objectives

  1. Instrument each pipeline stage with wall-clock timing and produce a summary DataFrame matching the workflow table from the reference notebook.
  2. Inspect the SfM output: count registered cameras, visualize camera positions in 3D, and plot the distribution of track lengths.
  3. Implement a field-aware PLY dispatcher that handles both simple (x,y,z,rgb) and full 3DGS PLY schemas without errors.
  4. Derive the opacity-weighted volume sort key from the mathematical definition and validate it against the reference implementation.
  5. Measure the per-stage compression ratio from float32 .ply through .splat through gzip, and compare to the SOG pipeline from Module 5.
  6. Load the .splat file in a WebGL viewer and document observable artifacts with hypotheses about their source in the pipeline.

Lab 6: Full Pipeline

This lab runs the full pipeline demonstrated by Nicholas McCarty (Upskilled Consulting), from a raw YouTube URL to a deployable .splat file viewable in any WebGL-capable browser. You will instrument each stage with timing, inspect the intermediate artifacts, and convert the final output to .splat with a converter that accepts both full 3DGS and simple PLY inputs.

Pipeline Overview

YouTube URL
    ↓  yt-dlp (video download)
downloaded_video.mp4
    ↓  ffmpeg -vf "fps=8"
images/ (~1,179 frames @ 640×272)
    ↓  COLMAP feature_extractor + exhaustive_matcher
database.db (133,000 pairs, 6,058 pruned)
    ↓  GLOMAP mapper (~70 min on T4)
sparse/0/  (585 cameras, 14,452 3D points)
    ↓  Depth-Anything-V2 (ViT-L)
depths/ (1,179 relative depth maps)
    ↓  make_depth_scale.py (affine alignment)
depth_scale.json (per-image scale/shift)
    ↓  gaussian-splatting train.py
       --optimizer_type sparse_adam
       --train_test_exp
output/*/point_cloud/iteration_30000/point_cloud.ply
    ↓  process_ply_to_splat()
point_cloud.splat (32 bytes/Gaussian)
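
The first four arrows of the diagram can be sketched as a list of command lines. This is a dry-run sketch, not the reference notebook's code: the function name `build_commands`, the working-directory layout, and the frame filename pattern are hypothetical, and only the basic flags of each tool are shown. Check each tool's `--help` output for the version installed in your runtime before executing anything.

```python
from pathlib import Path


def build_commands(url: str, work: Path, fps: int = 8) -> list[list[str]]:
    """Return argv lists for download -> frames -> COLMAP matching -> GLOMAP.

    Nothing is executed here. To run a stage, pass one list to
    subprocess.run(cmd, check=True). All paths are illustrative assumptions.
    """
    video = work / "downloaded_video.mp4"
    images = work / "images"
    database = work / "database.db"
    sparse = work / "sparse"
    return [
        # Download the video as an mp4 file.
        ["yt-dlp", "-f", "mp4", "-o", str(video), url],
        # Sample frames at a fixed rate (fps=8 in the diagram above).
        ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}", str(images / "%04d.jpg")],
        # Detect features, then match every image pair.
        ["colmap", "feature_extractor",
         "--database_path", str(database), "--image_path", str(images)],
        ["colmap", "exhaustive_matcher", "--database_path", str(database)],
        # Global SfM on the matched database.
        ["glomap", "mapper", "--database_path", str(database),
         "--image_path", str(images), "--output_path", str(sparse)],
    ]


if __name__ == "__main__":
    for cmd in build_commands("<video-url>", Path("work")):
        print(" ".join(cmd))
```

Keeping the commands as data makes it easy to wrap each one in a timer later and to log exactly what was run.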

Timing Budget (Colab A100)

Stage                                    Approximate Duration
---------------------------------------  --------------------
Video download + frame extraction        ~30 seconds
COLMAP feature extraction + matching     ~3 minutes
GLOMAP reconstruction                    ~70 minutes
Depth-Anything-V2 inference              ~35 minutes
3DGS training (30k iterations)           ~8 minutes
PLY → .splat conversion                  ~10 seconds
Total                                    ~2 hours
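
One way to collect your own version of this table is a small context manager that records wall-clock time per stage. The names `timed`, `timings`, and `summarize` are hypothetical helpers for this lab, not part of the reference notebook, and the sleep call is a stand-in for a real stage.

```python
import time
from contextlib import contextmanager

timings = []  # one dict per stage, in execution order


@contextmanager
def timed(stage: str):
    """Record wall-clock seconds for the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the stage raises, so failed runs still show up.
        timings.append({"stage": stage, "seconds": time.perf_counter() - start})


def summarize(rows):
    """Return the rows with each stage's share of the total, plus a Total row."""
    total = sum(r["seconds"] for r in rows)
    out = [{**r, "share": r["seconds"] / total if total else 0.0} for r in rows]
    out.append({"stage": "Total", "seconds": total, "share": 1.0 if total else 0.0})
    return out


# Stand-in for a real stage such as the GLOMAP call.
with timed("demo stage"):
    time.sleep(0.01)

for row in summarize(timings):
    print(f'{row["stage"]:<12} {row["seconds"]:8.3f} s  {row["share"]:6.1%}')
```

The list of dicts returned by `summarize` can be passed directly to `pandas.DataFrame(...)` to produce the summary DataFrame asked for in the objectives.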

Part A: Diagnose a PLY File

Before running the full pipeline, practice schema inspection. Load the simple shipwreck PLY and the 3DGS output PLY, print their field names, and write a detect_ply_type dispatcher that routes each to the correct converter.
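
A minimal sketch of such a dispatcher follows, using only the PLY header so it works for both ASCII and binary files. The field sets are assumptions: the full schema is taken to include `opacity`, `scale_*`, `rot_*`, and `f_dc_*`, and the simple schema is taken to name its colors `red`, `green`, `blue`. Print the fields of your own files first and adjust the sets if they differ. `read_ply_fields` is a hypothetical helper name.

```python
import io

# Assumed marker fields for each schema; adjust after printing your headers.
FULL_3DGS_FIELDS = {"opacity", "scale_0", "scale_1", "scale_2",
                    "rot_0", "rot_1", "rot_2", "rot_3",
                    "f_dc_0", "f_dc_1", "f_dc_2"}
SIMPLE_COLOR_FIELDS = {"red", "green", "blue"}


def read_ply_fields(stream) -> list:
    """Return vertex property names from a PLY header read from a binary stream."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    fields, in_vertex = [], False
    for raw in stream:
        parts = raw.decode("ascii", errors="replace").split()
        if not parts:
            continue
        if parts[0] == "end_header":
            return fields  # stop before any binary payload
        if parts[0] == "element":
            in_vertex = len(parts) > 1 and parts[1] == "vertex"
        elif parts[0] == "property" and in_vertex:
            fields.append(parts[-1])  # the property name is the last token
    raise ValueError("PLY header has no end_header line")


def detect_ply_type(fields) -> str:
    """Classify a vertex schema as '3dgs' or 'simple', or raise if unknown."""
    names = set(fields)
    if not {"x", "y", "z"} <= names:
        raise ValueError("PLY has no x/y/z vertex positions")
    if FULL_3DGS_FIELDS <= names:
        return "3dgs"
    if SIMPLE_COLOR_FIELDS <= names:
        return "simple"
    raise ValueError(f"unrecognized PLY schema: {sorted(names)}")


# In-memory example header; for a file use open(path, "rb") instead.
header = (b"ply\nformat binary_little_endian 1.0\nelement vertex 2\n"
          b"property float x\nproperty float y\nproperty float z\n"
          b"property uchar red\nproperty uchar green\nproperty uchar blue\n"
          b"end_header\n")
print(detect_ply_type(read_ply_fields(io.BytesIO(header))))
```

Raising on an unknown schema, rather than guessing, keeps a mislabeled file from being silently routed to the wrong converter.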

Part B: Implement the Sort Key

Instead of copying the converter directly, derive the opacity-weighted volume sort key from first principles:

  1. Confirm that exp(scale_0 + scale_1 + scale_2) equals exp(scale_0) * exp(scale_1) * exp(scale_2) (the product of activated scales) on a sample of Gaussians.
  2. Plot the distribution of sort keys across all Gaussians — is it heavy-tailed? Log-normal?
  3. Compare the first 100 sorted Gaussians vs. the last 100: visualize their positions and sizes.
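
Step 1 can be checked on a tiny hand-made sample before moving to the full array. This sketch assumes the key is the product of the activated scales times the sigmoid-activated opacity, which is how "opacity-weighted volume" reads, and that Gaussians are sorted in descending key order. Confirm both assumptions against the reference converter. The sample values are invented for illustration, not training output.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sort_key(log_scales, opacity_logit: float) -> float:
    """Assumed key: exp(s0 + s1 + s2) * sigmoid(opacity)."""
    return math.exp(sum(log_scales)) * sigmoid(opacity_logit)


# (log-scales, opacity logit) for three made-up Gaussians.
gaussians = [
    ((-4.0, -4.5, -5.0), 2.0),   # tiny and fairly opaque
    ((-1.0, -1.2, -0.8), -3.0),  # largest volume but nearly transparent
    ((-2.0, -2.0, -2.0), 4.0),   # mid-sized and almost fully opaque
]

# Step 1: exp of the sum equals the product of the activated scales.
for log_scales, _ in gaussians:
    lhs = math.exp(sum(log_scales))
    rhs = math.prod(math.exp(s) for s in log_scales)
    assert math.isclose(lhs, rhs, rel_tol=1e-12)

# Descending order: most visually significant Gaussians first.
order = sorted(range(len(gaussians)), key=lambda i: -sort_key(*gaussians[i]))
print(order)  # → [2, 1, 0]
```

The second Gaussian has the largest volume, yet its low opacity pushes it below the third. That interaction is what the distribution plot in step 2 should help you see at scale.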

Part C: Measure the Compression

Compare file sizes at each stage:

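  • Raw float32 .ply: N × 59 × 4 bytes, counting the 59 learned parameters per Gaussian at SH degree 3 (3 position, 48 color, 1 opacity, 3 scale, 4 rotation). Count the properties in your own header: an exporter that also writes three normal fields stores 62 floats per Gaussian.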
  • .splat before any further compression: N × 32 bytes
  • .splat gzip-compressed: measure actual compressed size
  • Compare to SOG's approach: how does this differ from Module 5's encoding strategy?
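
The size comparison above can be sketched with the standard library alone. The 32-byte record layout used for the synthetic data (three float32 positions, three float32 scales, four uint8 RGBA values, four uint8 rotation components) is an assumption about the .splat format; verify it against the converter you use. `stage_sizes` is a hypothetical helper, and synthetic records compress far better than real ones, so measure your actual file.

```python
import gzip
import struct

PLY_FLOATS = 59   # per-Gaussian float count from the text; check your header
SPLAT_BYTES = 32  # assumed layout: 3 f32 pos + 3 f32 scale + 4 u8 RGBA + 4 u8 rot


def stage_sizes(n: int, splat_bytes: bytes) -> dict:
    """Return byte counts and ratios for .ply -> .splat -> gzip."""
    ply = n * PLY_FLOATS * 4
    splat = len(splat_bytes)
    gz = len(gzip.compress(splat_bytes, compresslevel=9))
    return {
        "ply": ply,
        "splat": splat,
        "gzip": gz,
        "ply_to_splat": ply / splat,
        "splat_to_gzip": splat / gz,
    }


# Synthetic stand-in; for a real file use open("point_cloud.splat", "rb").read().
record = struct.pack("<3f3f4B4B",
                     0.1, 0.2, 0.3,       # position
                     0.01, 0.02, 0.03,    # activated scales
                     200, 180, 160, 255,  # RGBA
                     128, 128, 128, 255)  # quantized rotation
assert len(record) == SPLAT_BYTES
n = 1000
for name, value in stage_sizes(n, record * n).items():
    print(f"{name:>14}: {value:,.3f}")
```

With 59 floats per Gaussian the first ratio is fixed at 236 / 32 = 7.375 regardless of scene content; only the gzip stage depends on the data.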

Part D: Visualize the Reconstruction

Drop your .splat file into antimatter15's viewer and compare it to Nicholas McCarty's Bamburgh Castle result at nickmccarty.me/bamburgh-castle-splat. Document any artifacts you observe and hypothesize whether they stem from:

  • SfM registration failures (missing cameras)
  • Insufficient 3DGS training iterations
  • Static sort approximation artifacts
  • Depth prior scale alignment errors
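
To test the first hypothesis, compare the number of registered cameras with the number of extracted frames. The sketch below assumes the sparse model has been exported to COLMAP's text format (for example with `colmap model_converter --output_type TXT`), where `images.txt` stores two lines per image and the image name is the tenth field of the first line. `registered_names` and `registration_rate` are hypothetical helper names.

```python
def registered_names(images_txt: str) -> list:
    """Return image names from the contents of a COLMAP images.txt file."""
    lines = [l for l in images_txt.splitlines() if not l.startswith("#")]
    # Two lines per image: pose + name, then 2D points (which may be empty).
    return [l.split(maxsplit=9)[9] for l in lines[0::2] if l.strip()]


def registration_rate(n_registered: int, n_frames: int) -> float:
    """Fraction of extracted frames that SfM placed in the model."""
    return n_registered / n_frames


# Tiny in-memory example with two registered images.
sample = (
    "# Image list with two lines of data per image:\n"
    "1 1 0 0 0 0 0 0 1 0001.jpg\n"
    "10.0 20.0 -1\n"
    "3 1 0 0 0 0 0 0 1 0003.jpg\n"
    "\n"
)
names = registered_names(sample)
print(names)

# Counts from the pipeline overview: 585 cameras out of 1,179 frames.
print(f"{registration_rate(585, 1179):.1%}")
```

With the overview's counts, roughly half of the frames are registered, so it is worth checking whether the unregistered frames cluster in the same regions where the splat shows holes or floaters.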