3D Gaussian Splatting
Dec 30, 2025 • 18 min read
Rendering photorealistic 3D scenes from video used to require hours of NeRF training followed by slow ray-marching rendering — far too slow for real-time applications. In 2023, 3D Gaussian Splatting (3DGS) solved both problems simultaneously: training in minutes and rendering at 100+ FPS. The paper from Inria created an explosion of applications in AR/VR, architectural visualization, game development, and e-commerce — anywhere photorealistic 3D captures from the real world are needed.
1. NeRF vs Gaussian Splatting: A Side-by-Side Comparison
NeRF:
- Implicit representation — a neural network stores the scene
- Rendering requires running the network millions of times per frame (ray marching)
- Training: 1-24 hours on a GPU
- Rendering: 0.5-5 FPS (requires a powerful GPU)
- ✓ Smoother results, often better at reflections and transparency
3D Gaussian Splatting:
- Explicit representation — the scene is a list of 3D ellipsoids
- Rendering via GPU rasterization (like video game graphics)
- Training: 15-45 minutes on a GPU
- Rendering: 30-140 FPS (runs in browsers!)
- ✓ Faster, practical for real-time apps and web embedding
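To make the rendering-cost gap concrete, here is back-of-the-envelope arithmetic for why ray marching struggles to hit real-time frame rates (assuming the original NeRF's 64 coarse + 128 fine samples per ray):

```python
# Why NeRF rendering is slow: ray marching evaluates an MLP at many
# sample points along every pixel's ray, for every single frame.
width, height = 800, 800       # a modest render resolution
samples_per_ray = 192          # 64 coarse + 128 fine (original NeRF)

mlp_queries = width * height * samples_per_ray
print(f"{mlp_queries:,} MLP evaluations for one frame")
# 3DGS instead rasterizes explicit primitives once per frame —
# exactly the workload GPUs are built to run at game-speed.
```

Over 100 million network evaluations per frame is why NeRF needs a powerful GPU to reach even a few FPS, while rasterizing a few million splats fits comfortably in a per-frame GPU budget.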
2. How Gaussian Splatting Works
3DGS represents a scene as a collection of millions of 3D Gaussian "splats" — each one an oriented ellipsoid with learned properties:
- Position (x, y, z) — where in 3D space the splat lives
- Covariance — the shape and orientation of the ellipsoid (thin disk, round ball, long tube)
- Color via Spherical Harmonics — color that changes based on viewing angle (captures specular highlights, view-dependent effects)
- Opacity (alpha) — how transparent the splat is
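As a concrete sketch of these per-splat properties (names and layout here are illustrative, not the official .ply field names): the covariance is stored factored into a per-axis scale and a rotation quaternion, giving Σ = R S Sᵀ Rᵀ, and the degree-0 spherical-harmonic coefficients give the view-independent base color.

```python
import numpy as np

def covariance_from_scale_rotation(scale, quat):
    """Build the 3x3 covariance Sigma = R S S^T R^T from a per-axis
    scale vector and a unit quaternion (w, x, y, z)."""
    w, x, y, z = quat / np.linalg.norm(quat)
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    M = R @ np.diag(scale)
    return M @ M.T  # symmetric positive semi-definite by construction

# Degree-0 spherical harmonics: the view-independent base color.
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))

def sh0_to_rgb(sh_dc):
    """Convert DC SH coefficients to an RGB color in [0, 1]."""
    return np.clip(SH_C0 * np.asarray(sh_dc) + 0.5, 0.0, 1.0)

# A thin-disk splat: wide in x/y, nearly flat in z, identity rotation.
sigma = covariance_from_scale_rotation(np.array([0.02, 0.02, 0.001]),
                                       np.array([1.0, 0.0, 0.0, 0.0]))
```

Storing the covariance as scale + rotation (rather than a raw 3x3 matrix) is what keeps it a valid ellipsoid throughout gradient descent — any scale and quaternion yields a legal covariance.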
# The full 3DGS pipeline:
# 1. Capture: 30-100 photos/frames from different angles around the subject
# 2. Structure-from-Motion (SfM) with COLMAP
# → Estimates camera poses for each image
# → Produces sparse point cloud (~10k-100k points)
colmap automatic_reconstructor \
--workspace_path ./workspace \
--image_path ./images
# 3. Initialize Gaussians at sparse point cloud positions
# → Each point becomes one Gaussian splat
# 4. Differentiable rasterization training loop:
# For each training iteration:
# a) Rasterize current Gaussians from camera viewpoint
# b) Compare to ground truth image (L1 + SSIM loss)
# c) Backpropagate gradient to update Gaussian properties
# d) Adaptive control: split large Gaussians, prune ones with low opacity
# → After 30k iterations (~30 min): millions of optimized Gaussians
# 5. Output: .ply file with N million Gaussian parameters
# Typical scene: 2-6 million splats; the raw .ply typically runs to
# several hundred MB (web-optimized .splat files are far smaller)
3. Training with the Official Implementation
# Install gaussian-splatting
git clone https://github.com/graphdeco-inria/gaussian-splatting --recursive
conda env create --file environment.yml
conda activate gaussian_splatting
# Step 1: Run COLMAP to get camera poses
python convert.py -s /path/to/your/images
# Step 2: Train the Gaussian representation
python train.py \
-s /path/to/your/dataset \
--model_path ./output/my_scene \
--iterations 30000 # Standard quality
# --iterations 7000 # Quick preview quality
# --densify_grad_threshold 0.0002 # Controls when Gaussians are split
# Training output (logged every 1000 iterations):
# Iteration  1000 | Loss: 0.0821 | Num Gaussians: 12,453
# Iteration  5000 | Loss: 0.0234 | Num Gaussians: 89,234
# Iteration 15000 | Loss: 0.0089 | Num Gaussians: 1,234,567
# Iteration 30000 | Loss: 0.0052 | Num Gaussians: 2,891,234
# Step 3: Render novel views
python render.py \
-m ./output/my_scene \
--skip_train # Only render test views
# Step 4: Evaluate with PSNR/SSIM/LPIPS
python metrics.py -m ./output/my_scene
4. Browser Embedding with WebGL/WebGPU Viewers
Because Gaussian Splats are explicit data (not neural networks), they can be rendered in the browser. Several open-source WebGL viewers exist:
<!-- Option 1: Luma AI WebGL Component (easiest) -->
<script type="module" src="https://unpkg.com/@lumaai/luma-web@latest/dist/library/luma-web.js"></script>
<luma-neural-field
src="https://lumalabs.ai/capture/your-capture-id"
style="width: 100%; height: 500px; border-radius: 12px;"
></luma-neural-field>
<!-- Option 2: 3D Gaussian Splat viewer (open source, loads .ply files) -->
<!-- npm install @mkkellogg/gaussian-splats-3d -->
<script type="module">
import * as GaussianSplats3D from '@mkkellogg/gaussian-splats-3d';
const viewer = new GaussianSplats3D.Viewer({
cameraUp: [0, -1, 0],
initialCameraPosition: [2, 2, 2],
initialCameraLookAt: [0, 0, 0],
});
// Loads .splat or .ply file — host on your CDN
viewer.addSplatScene('./my_scene.splat').then(() => {
viewer.start();
});
</script>
<!-- Option 3: Use Three.js with a Gaussian Splat plugin -->
<!-- Renders at 60fps on modern laptops, 30fps on mobile -->
<!-- File size: convert .ply to .splat format (~50% smaller) with: -->
<!--   python convert_ply_to_splat.py model.ply model.splat -->
5. Generative 3D: Splats from a Single Image
# LGM (Large Gaussian Model) — generates 3D splats from one image in ~5 seconds
# Available on the HuggingFace Hub; the import below is an illustrative
# pseudo-API, not LGM's actual interface — adapt it to the repo's inference scripts
from lgm_inference_api import generate_3d  # hypothetical wrapper
# Input: single image → Output: 3D Gaussian Splat .ply file
output_path = generate_3d(
image_path="product_photo.jpg",
num_views=4, # How many views to hallucinate
export_format="ply", # or "splat" for web-optimized format
)
print(f"3D model saved to {output_path}")
# Also: InstantMesh, TripoSR (faster), Wonder3D
# These enable:
# - E-commerce: single product photo → 360° 3D viewer
# - Game assets: concept art → 3D asset in seconds
# - AR: any object photo → placeable AR experience
Frequently Asked Questions
How many photos do I need to capture a room?
For an indoor room: 100-300 overlapping photos from all angles, ensuring no surface is photographed from only one direction. For outdoor objects: 50-150 photos. Use consistent lighting — avoid direct sunlight that changes between frames. Videos work too: 2-3 minutes of smooth walking video gives enough frames for high-quality reconstruction. Tools like RealityCapture and Polycam on iPhone can guide you through optimal capture paths.
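If you are capturing from video, you have to decide how sparsely to sample frames before handing them to COLMAP. A tiny helper (`frame_stride` is a hypothetical convenience function, and the ffmpeg command in the comment is one common way to apply the result):

```python
def frame_stride(video_seconds: float, fps: float, target_frames: int) -> int:
    """How many source frames to skip between extracted training images."""
    total_frames = video_seconds * fps
    return max(1, round(total_frames / target_frames))

# A 2.5-minute walkthrough shot at 30 fps, aiming for ~150 training images:
stride = frame_stride(video_seconds=150, fps=30, target_frames=150)
print(stride)  # keep every 30th frame
# Apply the stride with e.g.:
#   ffmpeg -i walk.mp4 -vf "select='not(mod(n\,30))'" -vsync vfr frames/%04d.jpg
```

Sampling too densely wastes COLMAP time on near-duplicate frames; sampling too sparsely loses the overlap SfM needs, so aim for adjacent frames that still share most of their field of view.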
What hardware do I need for training?
Training requires a CUDA-capable GPU with at least 8GB VRAM for small scenes, 16GB+ for large rooms. An RTX 4080 trains a typical room scene in ~20 minutes. The official implementation doesn't run on Apple Silicon (no CUDA), but alternative implementations like gsplat and Nerfstudio support MPS (Metal) on Macs with some performance penalty. Google Colab Pro ($10/month) provides A100 GPUs if you don't have a suitable local GPU.
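For a rough sense of storage scale: the official .ply stores about 62 float32 values per Gaussian (position, normals, 48 SH color coefficients, opacity, scale, rotation), while the web-optimized .splat format quantizes down to roughly 32 bytes per splat. Both per-splat byte counts below are approximations, not guarantees:

```python
# Approximate on-disk size per splat (assumed byte counts, see above).
PLY_BYTES_PER_SPLAT = 62 * 4    # official .ply: 62 float32 values = 248 B
SPLAT_BYTES_PER_SPLAT = 32      # web-optimized .splat: quantized, ~32 B

def scene_size_mb(num_splats: int, bytes_per_splat: int) -> float:
    """Estimated file size in MB for a trained scene."""
    return num_splats * bytes_per_splat / 1e6

n = 3_000_000  # a typical room-scale scene
print(f".ply:   {scene_size_mb(n, PLY_BYTES_PER_SPLAT):.0f} MB")
print(f".splat: {scene_size_mb(n, SPLAT_BYTES_PER_SPLAT):.0f} MB")
```

The gap between the two formats is why web viewers prefer .splat: the same 3-million-splat scene downloads several times faster after conversion.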
Conclusion
3D Gaussian Splatting bridges the gap between photogrammetry (slow, mesh-based) and NeRF (slow rendering) by representing scenes as explicit, differentiable point clouds that render at game-speed. The technology's progression to generative 3D (single-image to splat in seconds) is already transforming e-commerce product visualization, game asset creation, and AR content. For developers, the WebGL embedding story makes it practical to ship interactive 3D captures in any web application today.
Vivek
AI Engineer
Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.