Drive-Through 3D Vehicle Exterior Reconstruction via Dynamic-Scene SfM and Distortion-Aware Gaussian Splatting

Published: arXiv: 2603.26638v1
Authors

Nitin Kulkarni, Akhil Devarashetti, Charlie Cluss, Livio Forte, Philip Schneider, Chunming Qiao, Alina Vereshchaka

Abstract

High-fidelity 3D reconstruction of vehicle exteriors improves buyer confidence in online automotive marketplaces, but generating these models in cluttered dealership drive-throughs presents severe technical challenges. Unlike static-scene photogrammetry, this setting features a dynamic vehicle moving against heavily cluttered, static backgrounds. This problem is further compounded by wide-angle lens distortion, specular automotive paint, and non-rigid wheel rotations that violate classical epipolar constraints. We propose an end-to-end pipeline utilizing a two-pillar camera rig. First, we resolve dynamic-scene ambiguities by coupling SAM 3 for instance segmentation with motion-gating to cleanly isolate the moving vehicle, explicitly masking out non-rigid wheels to enforce strict epipolar geometry. Second, we extract robust correspondences directly on raw, distorted 4K imagery using the RoMa v2 learned matcher guided by semantic confidence masks. Third, these matches are integrated into a rig-aware SfM optimization that utilizes CAD-derived relative pose priors to eliminate scale drift. Finally, we use a distortion-aware 3D Gaussian Splatting framework (3DGUT) coupled with a stochastic Markov Chain Monte Carlo (MCMC) densification strategy to render reflective surfaces. Evaluations on 25 real-world vehicles across 10 dealerships demonstrate that our full pipeline achieves a PSNR of 28.66 dB, an SSIM of 0.89, and an LPIPS of 0.21 on held-out views, representing a 3.85 dB improvement over standard 3D-GS, delivering inspection-grade interactive 3D models without controlled studio infrastructure.
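
To put the reported numbers in context, the sketch below computes PSNR for a pair of images and derives two figures implied by the abstract: the baseline PSNR of standard 3D-GS and the equivalent reduction in mean squared error. It is a minimal illustration, not the authors' evaluation code. The `psnr` helper and its flat-list image representation are assumptions, and the MSE reading is only approximate because the reported PSNR is presumably averaged over many held-out views.

```python
import math

def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Figures reported in the abstract.
full_pipeline_psnr = 28.66  # dB, full pipeline on held-out views
gain_over_3dgs = 3.85       # dB, improvement over standard 3D-GS

# Baseline PSNR implied by the stated gain.
implied_baseline_psnr = full_pipeline_psnr - gain_over_3dgs

# A gain of g dB corresponds to the MSE shrinking by a factor of 10**(g / 10).
mse_reduction = 10 ** (gain_over_3dgs / 10)

print(round(implied_baseline_psnr, 2), round(mse_reduction, 2))
```

Read this way, the 3.85 dB gain means the standard 3D-GS baseline sits near 24.8 dB and the full pipeline reduces squared pixel error by a factor of roughly 2.4.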

Paper Summary

Problem
The paper targets high-fidelity 3D reconstruction of vehicle exteriors captured in cluttered dealership drive-throughs. The setting is hard for four reasons: the vehicle moves against a static, cluttered background; the wide-angle lenses distort the imagery; automotive paint is specular; and the wheels rotate non-rigidly, which violates the epipolar constraints that classical structure-from-motion relies on.
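The effect of wheel rotation on epipolar geometry can be shown with a small numeric sketch. For two views related by rotation R and translation t, a point on a rigid body satisfies x2^T E x1 = 0 with E = [t]_x R. The example below assumes a pure translation (R is the identity) and made-up 3D points; none of the values come from the paper. A point on the car body moves only with the vehicle, while a wheel point also shifts because the wheel spins.

```python
def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def project(point):
    """Normalized image coordinates (x/z, y/z, 1) of a 3D point."""
    x, y, z = point
    return (x / z, y / z, 1.0)

def epipolar_residual(E, x1, x2):
    """Algebraic epipolar residual x2^T E x1 (zero for a rigid match)."""
    Ex1 = [sum(E[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * Ex1[r] for r in range(3))

# Hypothetical relative motion between two frames: pure translation,
# no rotation, so the essential matrix reduces to E = [t]_x.
t = (1.0, 0.0, 0.0)
E = skew(t)

# Point on the rigid body: it moves only with the vehicle.
body_before = (-0.8, 0.6, 4.0)
body_after = tuple(p + d for p, d in zip(body_before, t))

# Point on a wheel: same vehicle motion plus an extra shift from wheel spin.
wheel_spin = (0.0, 0.1, 0.0)  # hypothetical displacement
wheel_after = tuple(p + d + s for p, d, s in zip(body_before, t, wheel_spin))

rigid_residual = epipolar_residual(E, project(body_before), project(body_after))
wheel_residual = epipolar_residual(E, project(body_before), project(wheel_after))

print(rigid_residual, wheel_residual)
```

The body point gives a residual of zero, while the wheel point does not. Matches on the wheels therefore act as outliers in pose estimation, which is consistent with the paper's choice to mask them out.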
Key Innovation
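The contribution is an end-to-end pipeline, built around a two-pillar camera rig, that combines classical multi-view geometry with distortion-aware 3D Gaussian Splatting to reconstruct photorealistic, interactive 3D models of vehicles captured in drive-through environments. It has four stages. First, motion-gated semantic isolation couples SAM 3 instance segmentation with motion cues to separate the moving vehicle from the cluttered background, and masks out the non-rigid wheels. Second, the RoMa v2 learned matcher, guided by semantic confidence masks, extracts correspondences directly on raw, distorted 4K imagery. Third, a rig-aware SfM optimization uses CAD-derived relative pose priors to eliminate scale drift. Fourth, the 3DGUT distortion-aware Gaussian Splatting framework, paired with MCMC densification, renders the reflective surfaces.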
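The idea behind motion-gated isolation can be sketched as follows. This summary does not describe the paper's actual gating rule, so everything here is an assumption: masks are represented as sets of (row, column) pixels, an instance counts as moving when at least half of its pixels changed between frames, and the names `moving_fraction`, `gate_instances`, and `min_fraction` are hypothetical.

```python
def moving_fraction(instance_pixels, moving_pixels):
    """Fraction of an instance's pixels that are flagged as moving."""
    if not instance_pixels:
        return 0.0
    return len(instance_pixels & moving_pixels) / len(instance_pixels)

def gate_instances(instances, moving_pixels, wheel_pixels, min_fraction=0.5):
    """Union of pixels from moving instances, with wheel pixels removed.

    instances: dict mapping an instance label to its set of (row, col) pixels,
    standing in for the output of an instance segmenter.
    """
    kept = set()
    for pixels in instances.values():
        if moving_fraction(pixels, moving_pixels) >= min_fraction:
            kept |= pixels
    return kept - wheel_pixels

# Toy example: one car driving through, one car parked in the background.
instances = {
    "drive_through_car": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "parked_car": {(5, 5), (5, 6)},
}
moving_pixels = {(0, 0), (0, 1), (1, 0), (3, 3)}  # e.g. from frame differencing
wheel_pixels = {(1, 0)}                            # e.g. from a wheel segmenter

rigid_vehicle_mask = gate_instances(instances, moving_pixels, wheel_pixels)
print(sorted(rigid_vehicle_mask))
```

Gating at the instance level, rather than per pixel, keeps the whole moving vehicle even where parts of it show little frame-to-frame change, and it rejects parked cars that a segmenter alone would also label as vehicles.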
Practical Impact
The main application is online automotive marketplaces, where buyers can inspect a vehicle remotely through an interactive 3D model. This can improve buyer confidence and reduce re-inspection costs. The same models could serve wholesale auctions and dealership showrooms, and the pipeline produces them without controlled studio infrastructure.
Analogy / Intuitive Explanation
Imagine trying to take a 3D photo of a moving car in a crowded parking lot. The car's wheels are spinning, and the background is full of distractions. Capturing a high-quality 3D model of a vehicle in a drive-through poses the same challenge. The pipeline works as if it could freeze time, erase the background clutter, and then render the car's reflective surfaces in fine detail, yielding an interactive, photorealistic 3D model.
Paper Information
Categories: cs.CV, cs.RO
arXiv ID: 2603.26638v1
