GTSfM
End-to-end SFM framework based on GTSAM

| Platform | Build Status |
|:------------:| :-------------:|
| Ubuntu 20.04.3 | |
Georgia Tech Structure-from-Motion (GTSfM) is an end-to-end SfM pipeline based on GTSAM. GTSfM was designed from the ground up to natively support parallel computation using Dask.
For more details, please refer to our arXiv preprint.
<p align="left"> <img src="https://user-images.githubusercontent.com/16724970/121294002-a4d7a400-c8ba-11eb-895e-a50305c049b6.gif" height="315" title="Olsson Lund Dataset: Door, 12 images"> <img src="https://user-images.githubusercontent.com/16724970/142500100-ed3bd07b-f839-488e-a01d-823a9fbeaba4.gif" height="315"> </p> <p align="left"> <img src="https://user-images.githubusercontent.com/25347892/146043166-c5a172d7-17e0-4779-8333-8cd5f088ea2e.gif" height="345" title="2011212_opnav_022"> <img src="https://user-images.githubusercontent.com/25347892/146043553-5299e9d3-44c5-40a6-8ba8-ff43d2a28c8f.gif" height="345"> </p>

License
The majority of our code is governed by an MIT license and is suitable for commercial use. However, certain implementations featured in our repo (e.g., SuperPoint, SuperGlue) are governed by a non-commercial license and may not be used commercially.
Installation
GTSfM requires no compilation, as Python wheels are provided for GTSAM.
Initialize Git submodules
This repository includes external repositories as Git submodules, so unless you cloned with `git clone --recursive`, you need to initialize them:

```
git submodule update --init --recursive
```
GTSfM supports two installation methods. Choose the one that best fits your workflow:
Option 1: Conda Setup (Recommended for most users)
For detailed Conda installation instructions, see conda-setup.md
Option 2: UV Setup (Fast alternative package manager)
For detailed UV installation instructions, see uv-setup.md
Both methods will allow you to run GTSfM successfully.
Try It on Google Colab
For a quick hands-on example, check out this Colab notebook
Usage Guide (Running 3D Reconstruction)
Before running reconstruction, if you intend to use modules with pre-trained weights (e.g., SuperPoint, SuperGlue, or PatchmatchNet), first download the model weights by running:
```
bash scripts/download_model_weights.sh
```
Running SfM
GTSfM provides a unified runner that supports all dataset types through Hydra configuration.
To process a dataset containing only an image directory and EXIF metadata, ensure your dataset follows this structure:
```
└── {DATASET_NAME}
    └── images
        ├── image1.jpg
        ├── image2.jpg
        └── image3.jpg
```
Then, run the following command:
```
./run --config_name {CONFIG_NAME} --loader olsson --dataset_dir {DATASET_DIR} --num_workers {NUM_WORKERS}
```
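Before launching a run, it can help to sanity-check that your dataset follows the layout above. The helper below is purely illustrative (it is not part of GTSfM; the function name `check_dataset_layout` is made up):

```python
from pathlib import Path

def check_dataset_layout(dataset_dir: str) -> list[str]:
    """Illustrative helper (not part of GTSfM): verify the layout shown above.

    Returns a list of problems found; an empty list means the layout looks OK.
    """
    root = Path(dataset_dir)
    problems = []
    images = root / "images"
    if not images.is_dir():
        problems.append(f"missing images/ directory under {root}")
    elif not any(p.suffix.lower() in {".jpg", ".jpeg", ".png"} for p in images.iterdir()):
        problems.append("images/ contains no .jpg/.jpeg/.png files")
    return problems
```

An empty return value means the `images/` subdirectory exists and contains at least one image file.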
Loader Options
The runner exposes five portable CLI arguments for dataset selection and universal loader configuration:
- `--loader`: which loader to use (e.g., `olsson`, `colmap`)
- `--dataset_dir`: path to the dataset root
- `--images_dir`: optional path to the image directory (defaults depend on loader)
- `--max_resolution`: maximum length of the image's short side (overrides config)
- `--input_worker`: optional Dask worker address to pin image I/O (advanced; the runner sets this post-instantiation)
All other loader-specific settings (anything beyond the five above) must be specified as Hydra overrides on the nested config node `loader.*`. This is standard Hydra behavior: use dot-notation keys with `=` assignments.
To discover all available overrides for a given loader, open its YAML file in `gtsfm/configs/loader/`.
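As a rough illustration of how dot-notation overrides update a nested config, the sketch below applies Hydra-style `key.subkey=value` strings to a plain dict. This is a simplified picture of the mechanism only, not Hydra's actual implementation (the function name `apply_overrides` is made up):

```python
# Simplified sketch of Hydra-style dot-notation overrides on a nested dict.
# Illustrative only; Hydra/OmegaConf do the real work in GTSfM.
def apply_overrides(cfg: dict, overrides: list) -> dict:
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for key in parents:
            node = node.setdefault(key, {})
        # Best-effort typing: try int, then float, then bool, else keep string.
        for cast in (int, float):
            try:
                node[leaf] = cast(raw_value)
                break
            except ValueError:
                pass
        else:
            node[leaf] = {"true": True, "false": False}.get(raw_value, raw_value)
    return cfg

cfg = {"loader": {"max_resolution": 760}}
apply_overrides(cfg, ["loader.max_resolution=1200", "loader.use_gt_intrinsics=true"])
```

After the call, `cfg["loader"]["max_resolution"]` is `1200` and `use_gt_intrinsics` is `True`, mirroring what `loader.max_resolution=1200 loader.use_gt_intrinsics=true` does on the command line.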
Required Image Metadata
Currently, we require EXIF data to be embedded in your images. Alternatively, you can provide:
- Ground truth intrinsics in the expected format for an Olsson dataset
- COLMAP-exported text data
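For context on why EXIF matters: the focal length in millimeters (from EXIF) together with the sensor width yields a focal length in pixels, which is what camera intrinsics need. The standard conversion is shown below (a back-of-the-envelope sketch, not GTSfM's exact code; the function name is made up):

```python
def focal_length_px(focal_mm: float, sensor_width_mm: float, image_width_px: int) -> float:
    """Convert an EXIF focal length (mm) to pixels: f_px = f_mm * W_px / W_mm."""
    return focal_mm * image_width_px / sensor_width_mm

# e.g., a 24 mm lens on a 36 mm-wide full-frame sensor, image 6000 px wide:
f = focal_length_px(24.0, 36.0, 6000)  # 4000.0 px
```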
Additional CLI Arguments
- `--run_mvs`: enables dense Multi-View Stereo (MVS) reconstruction after the sparse SfM pipeline.
- `--run_gs`: enables Gaussian Splatting for dense scene representation.
Many other Dask-related arguments are available. Run `./run --help` for more information.
Examples
Example (deep front-end on Olsson, single worker):
```
./run --dataset_dir tests/data/set1_lund_door \
    --config_name deep_front_end.yaml \
    --loader olsson \
    --num_workers 1 \
    loader.max_resolution=1200
```
For a dataset with metadata formatted in the COLMAP style:
```
./run --dataset_dir datasets/gerrard-hall \
    --config_name deep_front_end.yaml \
    --loader colmap \
    --num_workers 5 \
    loader.use_gt_intrinsics=true \
    loader.use_gt_extrinsics=true
```
You can monitor the distributed computation using the Dask dashboard.
Note: The dashboard will only display activity while tasks are actively running, but comprehensive performance reports can be found in the dask_reports folder.
Comparing GTSFM Output with COLMAP Output
To compare GTSFM output with COLMAP, use the following command:
```
./run --config_name {CONFIG_NAME} --loader colmap --dataset_dir {DATASET_DIR} --num_workers {NUM_WORKERS} --max_frame_lookahead {MAX_FRAME_LOOKAHEAD}
```
Visualizing Results with Open3D
To visualize the reconstructed scene using Open3D, run:
```
python gtsfm/visualization/view_scene.py
```
Speeding Up Front-End Processing
For users who work with the same dataset repeatedly, GTSFM allows caching front-end results for faster inference.
Refer to the detailed guide:
📄 GTSFM Front-End Cacher README
Running GTSFM on a Multi-Machine Cluster
For users who want to run GTSFM on a cluster of multiple machines, follow the setup instructions here:
📄 CLUSTER.md
Where Are the Results Stored?
- The output will be saved in `--output_root`, which defaults to the `results` folder in the repo root.
- Poses and 3D tracks are stored in COLMAP format inside the `ba_output` subdirectory of `--output_root`.
- You can visualize these using the COLMAP GUI.
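The COLMAP text format mentioned above stores each image's pose as a quaternion plus a translation in `images.txt`. The sketch below shows roughly how those pose lines can be read (illustrative only; the function name `parse_images_txt` is made up, and COLMAP/pycolmap provide official readers):

```python
# Illustrative sketch of reading COLMAP images.txt pose lines; use COLMAP's or
# pycolmap's official readers for real work.
def parse_images_txt(text: str) -> dict:
    """Map image name -> (qw, qx, qy, qz, tx, ty, tz)."""
    poses = {}
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    # Each image occupies two lines: a pose line, then a 2D-points line (skipped here).
    for pose_line in lines[0::2]:
        fields = pose_line.split()
        # Layout: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME
        name = fields[9]
        poses[name] = tuple(float(v) for v in fields[1:8])
    return poses

sample = (
    "# IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME\n"
    "1 1.0 0.0 0.0 0.0 0.5 0.0 -1.0 1 image1.jpg\n"
    "\n"  # the 2D-points line (empty here)
)
poses = parse_images_txt(sample)
```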
Nerfstudio
We provide a preprocessing script to convert the camera poses estimated by GTSfM to nerfstudio format:
```
python scripts/prepare_nerfstudio.py --results_path {RESULTS_DIR} --images_dir {IMAGES_DIR}
```
The results are stored in the nerfstudio_input subdirectory inside {RESULTS_DIR}, which can be used directly with nerfstudio if installed:
```
ns-train nerfacto --data {RESULTS_DIR}/nerfstudio_input
```
More Loader Details
The runner supports all loaders through --loader, --dataset_dir, and --images_dir. Any additional, loader‑specific settings are passed as Hydra overrides on the nested node loader.* (this is standard Hydra usage).
General pattern
```
./run \
    --config_name <config_file> \
    --loader <loader_type> \
    --dataset_dir <path> \
    [--images_dir <path>] \
    [--max_resolution <int>] \
    [--input_worker <address>] \
    loader.<param>=<value> \
    [loader.<param2>=<value2> ...]
```
Available Loaders
The following loader types are supported:
- `colmap`: COLMAP format datasets
- `hilti`: Hilti SLAM challenge datasets
- `astrovision`: AstroVision space datasets
- `olsson`: Olsson format datasets
- `argoverse`: Argoverse autonomous driving datasets
- `mobilebrick`: MobileBrick datasets
- `one_d_sfm`: 1DSFM format datasets
- `tanks_and_temples`: Tanks and Temples benchmark datasets
- `yfcc_imb`: YFCC Image Matching Benchmark datasets
For the complete list of available arguments for each loader, run:
```
./run --help
```
Example: Olsson Loader (images + EXIF)
```
./run \
    --config_name sift_front_end.yaml \
    --loader olsson \
    --dataset_dir /path/to/olsson_dataset \
    loader.max_resolution=1200
```
Example: Colmap Loader (COLMAP text export)
```
./run \
    --config_name sift_front_end.yaml \
    --loader colmap \
    --dataset_dir /path/to/colmap_dataset \
    loader.use_gt_intrinsics=true \
    loader.use_gt_extrinsics=true
```
Tip: consult `gtsfm/configs/loader/<loader_name>.yaml` for the full set of fields supported by each loader.
Repository Structure
GTSfM is designed in a modular way. Each module can be swapped out with a new one, as long as it implements the API of the module's abstract base class. The code is organized as follows:
- `gtsfm`: source code, organized as:
  - `averaging`
  - `bundle`: bundle adjustment implementations
  - `common`: basic classes used throughout GTSfM, such as `Keypoints`, `Image`, `SfmTrack2d`, etc.
  - `data_association`: 3D point triangulation (DLT) w/ or w/o RANSAC, from 2D point-tracks
  - `densify`
  - `frontend`: SfM front-end code, including:
    - `detector`: keypoint detector implementations (DoG, etc.)
    - `descriptor`: feature descriptor implementations (SIFT, SuperPoint, etc.)
    - `matcher`: descriptor matching implementations ([S
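The swap-by-interface design described above can be sketched as follows. The class names here are hypothetical illustrations of the pattern, not GTSfM's actual API: each module implements an abstract base class, so the pipeline only depends on the interface and any conforming implementation can be dropped in.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of the swap-by-interface pattern; class and function
# names are illustrative, not GTSfM's actual API.
class DetectorBase(ABC):
    @abstractmethod
    def detect(self, image) -> list:
        """Return keypoints detected in the image."""

class ThresholdDetector(DetectorBase):
    """Toy detector: keeps pixel coordinates whose value exceeds a threshold."""
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def detect(self, image) -> list:
        return [(r, c) for r, row in enumerate(image)
                for c, v in enumerate(row) if v > self.threshold]

def run_frontend(detector: DetectorBase, image) -> list:
    # The pipeline depends only on the abstract interface, so any
    # DetectorBase implementation can be swapped in without other changes.
    return detector.detect(image)

image = [[0.1, 0.9], [0.8, 0.2]]
keypoints = run_frontend(ThresholdDetector(threshold=0.5), image)  # [(0, 1), (1, 0)]
```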