Buildings Built in Minutes: Classical Structure from Motion Pipeline
Project type
University course project
Date
March 2026
Location
Worcester, MA
Tech stack: Python, NumPy, SciPy, OpenCV, SIFT, RANSAC, Levenberg-Marquardt, Bundle Adjustment, SVD
A from-scratch implementation of a complete classical Structure from Motion (SfM) pipeline that reconstructs 3D scene geometry and recovers camera poses from a set of monocular images. Built every stage of the multi-view geometry stack — from SIFT feature matching to global Bundle Adjustment — and validated the system on a custom 5-image dataset of WPI's Unity Hall, plus self-captured Church and Library scenes for extra credit.
Highlights
Implemented the full SfM pipeline end to end, starting with SIFT feature matching filtered by Lowe's ratio test and Fundamental Matrix estimation via the normalized 8-point algorithm, with RANSAC outlier rejection based on Sampson distance.
Derived the Essential Matrix from F using the camera intrinsics (E = KᵀFK) and enforced the (1, 1, 0) singular-value structure for robustness to noise.
Decomposed E into four candidate poses and resolved the ambiguity with the cheirality condition: only the configuration placing all triangulated points in front of both cameras is kept.
Implemented both linear and non-linear triangulation, using Levenberg-Marquardt with Huber loss for robust reprojection-error minimization.
Built a Linear PnP + RANSAC pipeline for incremental camera registration, refined with Non-linear PnP using a quaternion parameterization so that the optimized rotation remains a valid element of SO(3).
Implemented full Bundle Adjustment with a binary visibility matrix to jointly refine all camera poses and 3D points.
Captured and processed an additional dataset (Nikon D5600 — Library and Church scenes), including a detailed analysis of why incremental SfM failed to register all views due to insufficient 2D-3D correspondences for PnP.
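To illustrate the Essential Matrix steps listed above, here is a minimal NumPy sketch rather than the project's actual code. The function names essential_from_fundamental and decompose_essential are hypothetical, and the sketch assumes the camera convention x_cam = R X + t. It forms E = KᵀFK, projects it onto the (1, 1, 0) singular-value structure, and enumerates the four candidate poses.

```python
import numpy as np


def essential_from_fundamental(F, K):
    """Compute E = K^T F K and project it onto the (1, 1, 0) singular-value structure."""
    E = K.T @ F @ K
    U, _, Vt = np.linalg.svd(E)
    # Replace the noisy singular values with the ideal (1, 1, 0) pattern.
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt


def decompose_essential(E):
    """Return the four candidate (R, t) pairs, assuming x_cam = R @ X + t."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    candidates = []
    for R in (U @ W @ Vt, U @ W.T @ Vt):
        # E is only defined up to sign, so flip R if it came out as a reflection.
        if np.linalg.det(R) < 0:
            R = -R
        # The translation direction is the left null vector of E, up to sign.
        for t in (U[:, 2], -U[:, 2]):
            candidates.append((R, t))
    return candidates
```

Each candidate would then go through triangulation, and the cheirality check keeps the pose whose triangulated points have positive depth in both cameras.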
Results (Unity Hall, 342 reconstructed points)
Reprojection RMSE reduced from 50.13 px (Linear PnP) → 0.62 px (Non-linear PnP) → 0.26 px (after Bundle Adjustment) — a ~190× improvement through geometric refinement.
Mean reprojection error of 0.20 px after BA, with max error 1.05 px.
Successfully reconstructed Unity Hall from 5 calibrated views, with all camera poses recovered and globally optimized.
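For context on how figures like these can be computed, the sketch below evaluates reprojection error over every observation flagged in a binary visibility matrix. It is an illustrative assumption, not the project's evaluation code: the function name reprojection_stats, the array layouts, and the choice to measure error as the per-observation Euclidean pixel distance are all hypothetical.

```python
import numpy as np


def reprojection_stats(points_3d, cameras, K, observations, visibility):
    """Return (rmse, mean, max) of per-observation reprojection errors in pixels.

    points_3d:    (N, 3) world points
    cameras:      list of M (R, t) pairs, with x_cam = R @ X + t
    observations: (M, N, 2) measured pixel coordinates
    visibility:   (M, N) binary matrix, 1 where point j is seen in camera i
    """
    errors = []
    for i, (R, t) in enumerate(cameras):
        seen = visibility[i].astype(bool)
        if not seen.any():
            continue
        # Transform visible points into the camera frame, then project with K.
        X_cam = points_3d[seen] @ R.T + t
        proj = X_cam @ K.T
        pixels = proj[:, :2] / proj[:, 2:3]
        # Euclidean distance between projected and measured pixels.
        errors.append(np.linalg.norm(pixels - observations[i, seen], axis=1))
    errors = np.concatenate(errors)
    rmse = np.sqrt(np.mean(errors ** 2))
    return rmse, errors.mean(), errors.max()
```

Under this definition the RMSE is never smaller than the mean error, which is consistent with the 0.26 px RMSE and 0.20 px mean reported after Bundle Adjustment.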















