Full Pipeline: YouTube Video → GLOMAP → 3DGS → .splat
Lab 6: Full Pipeline
This lab runs the full pipeline demonstrated by Nicholas McCarty (Upskilled Consulting), from a raw YouTube URL to a deployable .splat file viewable in any WebGL-capable browser. You will instrument each stage with timing, inspect the intermediate artifacts, and convert both standard (3DGS) and simple PLY inputs to .splat.
Pipeline Overview
YouTube URL
↓ yt-dlp (video download)
downloaded_video.mp4
↓ ffmpeg -vf "fps=8"
images/ (~1,179 frames @ 640×272)
↓ COLMAP feature_extractor + exhaustive_matcher
database.db (133,000 pairs, 6,058 pruned)
↓ GLOMAP mapper (~70 min on T4)
sparse/0/ (585 cameras, 14,452 3D points)
↓ Depth-Anything-V2 (ViT-L)
depths/ (1,179 relative depth maps)
↓ make_depth_scale.py (affine alignment)
depth_scale.json (per-image scale/shift)
↓ gaussian-splatting train.py
--optimizer_type sparse_adam
--train_test_exp
output/*/point_cloud/iteration_30000/point_cloud.ply
↓ process_ply_to_splat()
point_cloud.splat (32 bytes/Gaussian)
Timing Budget (Colab A100)
| Stage | Approximate Duration |
|---|---|
| Video download + frame extraction | ~30 seconds |
| COLMAP feature extraction + matching | ~3 minutes |
| GLOMAP reconstruction | ~70 minutes |
| Depth-Anything-V2 inference | ~35 minutes |
| 3DGS training (30k iterations) | ~8 minutes |
| PLY → .splat conversion | ~10 seconds |
| Total | ~2 hours |
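To fill in a table like the one above with your own measurements, each stage can be wrapped in a small timer. The following is a minimal sketch, not part of the original pipeline code: `stage_timer`, `report`, and the `timings` dictionary are hypothetical names, and the `time.sleep` call stands in for a real stage such as the ffmpeg or GLOMAP invocation.

```python
import time
from contextlib import contextmanager

# Hypothetical log of stage name -> wall-clock seconds.
timings = {}

@contextmanager
def stage_timer(name, log=timings):
    """Record the wall-clock duration of one pipeline stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the stage raises, so failed runs are still timed.
        log[name] = time.perf_counter() - start

def report(log):
    """Format the recorded timings as aligned text rows plus a total."""
    rows = [f"{name:<40s} {secs:8.1f} s" for name, secs in log.items()]
    rows.append(f"{'Total':<40s} {sum(log.values()):8.1f} s")
    return "\n".join(rows)

# Placeholder stage: replace the sleep with the real subprocess call.
with stage_timer("frame extraction (placeholder)"):
    time.sleep(0.01)

print(report(timings))
```

Wall-clock time is used deliberately here, since most stages run as external processes or on the GPU and CPU time would under-report them.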
Part A: Diagnose a PLY File
Before running the full pipeline, practice schema inspection. Load the simple shipwreck PLY and the 3DGS output PLY, print their field names, and write a detect_ply_type dispatcher that routes each to the correct converter.
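One possible shape for the dispatcher is sketched below. It reads only the PLY header, so it works on large files without loading vertex data. The field sets are assumptions: the 3DGS set uses the property names that gaussian-splatting checkpoints typically carry (`opacity`, `scale_*`, `rot_*`, `f_dc_*`), and the "simple" case is assumed to mean `x, y, z` plus `red, green, blue`. Check both against what your own files actually contain.

```python
import io

def read_ply_fields(stream):
    """Return the vertex property names from a PLY header (binary stream)."""
    fields, in_vertex = [], False
    for raw in stream:
        parts = raw.decode("ascii", errors="replace").split()
        if not parts:
            continue
        if parts[0] == "end_header":
            break
        if parts[0] == "element":
            in_vertex = len(parts) > 1 and parts[1] == "vertex"
        elif parts[0] == "property" and in_vertex:
            fields.append(parts[-1])  # the property name is the last token
    return fields

# Assumed schemas; adjust after printing the real field names.
GS_FIELDS = {"opacity", "scale_0", "scale_1", "scale_2",
             "rot_0", "rot_1", "rot_2", "rot_3",
             "f_dc_0", "f_dc_1", "f_dc_2"}
RGB_FIELDS = {"red", "green", "blue"}

def detect_ply_type(fields):
    """Classify a PLY by its vertex fields: '3dgs' or 'simple'."""
    names = set(fields)
    if not {"x", "y", "z"} <= names:
        raise ValueError("PLY has no x/y/z vertex positions")
    if GS_FIELDS <= names:
        return "3dgs"
    if RGB_FIELDS <= names:
        return "simple"
    raise ValueError(f"Unrecognized PLY schema: {sorted(names)}")

# In-memory headers standing in for the two lab files.
simple_header = (b"ply\nformat binary_little_endian 1.0\nelement vertex 3\n"
                 b"property float x\nproperty float y\nproperty float z\n"
                 b"property uchar red\nproperty uchar green\n"
                 b"property uchar blue\nend_header\n")
gs_props = ["x", "y", "z"] + sorted(GS_FIELDS)
gs_header = (b"ply\nformat binary_little_endian 1.0\nelement vertex 3\n"
             + "".join(f"property float {p}\n" for p in gs_props).encode()
             + b"end_header\n")

print(detect_ply_type(read_ply_fields(io.BytesIO(simple_header))))
print(detect_ply_type(read_ply_fields(io.BytesIO(gs_header))))
```

For real files, pass `open(path, "rb")` instead of the `io.BytesIO` objects. Raising on an unknown schema, rather than guessing, keeps a mislabeled file from silently reaching the wrong converter.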
Part B: Implement the Sort Key
Instead of copying the converter directly, derive the opacity-weighted volume sort key from first principles:
- Confirm that exp(scale_0 + scale_1 + scale_2) equals exp(scale_0) * exp(scale_1) * exp(scale_2) (the product of activated scales) on a sample of Gaussians.
- Plot the distribution of sort keys across all Gaussians. Is it heavy-tailed? Log-normal?
- Compare the first 100 sorted Gaussians vs. the last 100: visualize their positions and sizes.
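The steps above can be prototyped on synthetic data before touching a real checkpoint. This sketch assumes that `scale_*` are stored as log-scales, that `opacity` is stored as a pre-sigmoid logit, and that the key is the volume proxy multiplied by activated opacity and sorted in descending order. The random distributions are placeholders, not measurements from any trained scene.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic stand-ins for the raw PLY fields (assumed parameterization).
log_scales = rng.normal(loc=-4.0, scale=1.0, size=(n, 3))
opacity_logit = rng.normal(loc=0.0, scale=2.0, size=n)

# Step 1: exp of the summed log-scales equals the product of activated scales.
volume_from_sum = np.exp(log_scales.sum(axis=1))
volume_from_product = np.exp(log_scales).prod(axis=1)
identity_holds = np.allclose(volume_from_sum, volume_from_product)
print("identity holds:", identity_holds)

# Step 2: weight the volume proxy by sigmoid-activated opacity.
alpha = 1.0 / (1.0 + np.exp(-opacity_logit))
sort_key = volume_from_sum * alpha

# Step 3: sort descending, so large and opaque Gaussians come first.
order = np.argsort(-sort_key)
first_100, last_100 = order[:100], order[-100:]

# Rough tail check: a 99th percentile far above the median suggests a heavy tail.
q50, q99 = np.percentile(sort_key, [50, 99])
print(f"median key = {q50:.3e}, 99th percentile = {q99:.3e}")
print(f"mean key, first 100 = {sort_key[first_100].mean():.3e}")
print(f"mean key, last 100  = {sort_key[last_100].mean():.3e}")
```

To test the log-normal hypothesis on real data, histogram `np.log(sort_key)` and check whether it looks approximately Gaussian. Summing the log-scales before exponentiating is also slightly cheaper than three separate `exp` calls.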
Part C: Measure the Compression
Compare file sizes at each stage:
- Raw float32 .ply: N × 59 × 4 bytes
- .splat before any further compression: N × 32 bytes
- .splat gzip-compressed: measure actual compressed size
- Compare to SOG's approach: how does this differ from Module 5's encoding strategy?
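The size formulas above can be checked with a few lines of arithmetic, and the gzip measurement can be rehearsed on a synthetic buffer. The 32-byte record layout used here (three float32 positions, three float32 scales, four uint8 color bytes, four uint8 rotation bytes) is an assumption about the .splat format. The random payload is only a stand-in: random bytes compress far worse than a real scene, so replace it with the bytes of your own file.

```python
import gzip
import numpy as np

FLOATS_PER_GAUSSIAN_PLY = 59   # figure from the lab text; PLY header bytes ignored
BYTES_PER_GAUSSIAN_SPLAT = 32

def raw_sizes(n):
    """Return (ply_bytes, splat_bytes) for n Gaussians, excluding headers."""
    return n * FLOATS_PER_GAUSSIAN_PLY * 4, n * BYTES_PER_GAUSSIAN_SPLAT

# Assumed packed record layout: 12 + 12 + 4 + 4 = 32 bytes.
splat_dtype = np.dtype([
    ("position", "<f4", (3,)),
    ("scale", "<f4", (3,)),
    ("rgba", "u1", (4,)),
    ("rot", "u1", (4,)),
])

n = 5_000
rng = np.random.default_rng(0)
records = np.zeros(n, dtype=splat_dtype)
records["position"] = rng.normal(size=(n, 3)).astype("<f4")
records["scale"] = np.exp(rng.normal(-4.0, 1.0, size=(n, 3))).astype("<f4")
records["rgba"] = rng.integers(0, 256, size=(n, 4), dtype=np.uint8)
records["rot"] = rng.integers(0, 256, size=(n, 4), dtype=np.uint8)

payload = records.tobytes()
compressed = gzip.compress(payload)

ply_bytes, splat_bytes = raw_sizes(n)
print(f".ply (raw float32): {ply_bytes:,} bytes")
print(f".splat:             {splat_bytes:,} bytes "
      f"({ply_bytes / splat_bytes:.2f}x smaller)")
print(f".splat + gzip:      {len(compressed):,} bytes (synthetic data)")
```

For a real measurement, read the file with `open(path, "rb").read()` and pass those bytes to `gzip.compress`. The PLY-to-.splat ratio is fixed by the record sizes alone, while the gzip ratio depends on the scene.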
Part D: Visualize the Reconstruction
Drop your .splat file into antimatter15's viewer and compare it to Nicholas McCarty's Bamburgh Castle result at nickmccarty.me/bamburgh-castle-splat. Document any artifacts you observe and hypothesize whether they stem from:
- SfM registration failures (missing cameras)
- Insufficient 3DGS training iterations
- Static sort approximation artifacts
- Depth prior scale alignment errors
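The first hypothesis can be checked numerically before inspecting the render. If the 585 cameras in the overview count registered frames (an assumption), then roughly half of the 1,179 extracted frames were registered. The sketch below uses a hypothetical helper, `registration_report`, to compute the registration rate and the longest run of consecutive unregistered frames from two lists of filenames. How you obtain the registered names from sparse/0/ is left to your own tooling.

```python
def registration_report(all_frames, registered):
    """Return (rate, missing, longest_gap) for a set of extracted frames.

    all_frames: filenames of every extracted frame.
    registered: filenames the SfM stage placed in the reconstruction.
    """
    frames = sorted(all_frames)
    registered = set(registered)
    missing = [f for f in frames if f not in registered]
    rate = 1.0 - len(missing) / len(frames)

    # Longest run of consecutive unregistered frames, in filename order.
    longest = run = 0
    for f in frames:
        run = run + 1 if f not in registered else 0
        longest = max(longest, run)
    return rate, missing, longest

# Toy example: 10 frames, three consecutive ones unregistered.
frames = [f"{i:05d}.jpg" for i in range(1, 11)]
registered_names = frames[:4] + frames[7:]

rate, missing, longest_gap = registration_report(frames, registered_names)
print(f"registration rate: {rate:.0%}")
print(f"missing frames:    {missing}")
print(f"longest gap:       {longest_gap} frames")
```

A long contiguous gap points to a stretch of video the reconstruction never saw, which tends to produce holes or floaters in that region. Scattered single-frame dropouts are usually less harmful.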