[ECCV 2022] SimpleRecon: 3D Reconstruction Without 3D Convolutions
This is the reference PyTorch implementation for training and testing MVS depth estimation models using the method described in
<p align="center"> <img src="media/teaser.jpeg" alt="example output" width="720" /> </p>
Mohamed Sayed, John Gibson, Jamie Watson, Victor Adrian Prisacariu, Michael Firman, and Clément Godard
Paper, ECCV 2022 (arXiv pdf), Supplemental Material, Project Page, Video
https://github.com/nianticlabs/simplerecon/assets/14994206/ae5074c2-6537-45f1-9f5e-0b3646a96dcb
https://user-images.githubusercontent.com/14994206/189788536-5fa8a1b5-ae8b-4f64-92d6-1ff1abb03eaf.mp4
This code is for non-commercial use; please see the license file for terms. If you do find any part of this codebase helpful, please cite our paper using the BibTex below and link this repo. Thanks!
🆕 Updates
25/05/2023: Fixed package versions for llvm-openmp, clang, and protobuf. Use this new environment file if you have trouble running the code and/or if dataloading is being limited to a single thread.
09/03/2023: Added kornia version to the environments file to fix kornia typing issue. (thanks @natesimon!)
26/01/2023: The license has been modified to make running the model for academic reasons easier. Please see the LICENSE file for the exact details.
There is an update as of 31/12/2022 that fixes slightly wrong intrinsics, flip augmentation for the cost volume, and a numerical precision bug in projection. All scores improve. You will need to update your forks and use new weights. See Bug Fixes.
Precomputed scans for online default frames are here: https://drive.google.com/drive/folders/1dSOFI9GayYHQjsx4I_NG0-3ebCAfWXjV?usp=share_link
Table of Contents
- 🗺️ Overview
- ⚙️ Setup
- 📦 Models
- 🚀 Speed
- 📝 TODOs:
- 🏃 Running out of the box!
- 💾 ScanNetv2 Dataset
- 🖼️🖼️🖼️ Frame Tuples
- 📊 Testing and Evaluation
- 👉☁️ Point Cloud Fusion
- 📊 Mesh Metrics
- ⏳ Training
- 🔧 Other training and testing options
- ✨ Visualization
- 📝🧮👩💻 Notation for Transformation Matrices
- 🗺️ World Coordinate System
- 🐜🔧 Bug Fixes
- 🗺️💾 COLMAP Dataset
- 🙏 Acknowledgements
- 📜 BibTeX
- 👩⚖️ License
🗺️ Overview
SimpleRecon takes as input posed RGB images, and outputs a depth map for a target image.
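At a glance, the contract is: a target RGB frame, a set of posed source frames, and camera intrinsics go in; one depth map aligned with the target comes out. Below is a minimal sketch of that interface only — `predict_depth`, the frame count, and all shapes are illustrative assumptions, not the repo's actual API:

```python
import numpy as np

def predict_depth(target_rgb, source_rgbs, source_poses, K):
    """Placeholder for the network: returns a dummy depth map at the
    target's resolution. The real model warps source-frame features into
    a cost volume using the poses and intrinsics before regressing depth."""
    h, w, _ = target_rgb.shape
    assert len(source_rgbs) == len(source_poses)  # one 4x4 pose per source
    assert K.shape == (3, 3)
    return np.ones((h, w), dtype=np.float32)

target = np.zeros((192, 256, 3), dtype=np.uint8)     # target RGB frame
sources = [np.zeros_like(target) for _ in range(7)]  # posed source frames
poses = [np.eye(4, dtype=np.float32) for _ in range(7)]
K = np.array([[600.0, 0.0, 128.0],
              [0.0, 600.0, 96.0],
              [0.0, 0.0, 1.0]])

depth = predict_depth(target, sources, poses, K)
print(depth.shape)  # (192, 256)
```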
⚙️ Setup
Assuming a fresh Anaconda distribution, you can install dependencies with:
```shell
conda env create -f simplerecon_env.yml
```
We ran our experiments with PyTorch 1.10, CUDA 11.3, Python 3.9.7 and Debian GNU/Linux 10.
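As a quick sanity check (illustrative only, not part of the repo), you can confirm the interpreter roughly matches the tested setup before creating the environment — the experiments used Python 3.9.7:

```python
import sys

# The authors ran experiments with Python 3.9.7, so warn if the
# interpreter is older than 3.9.
ok = sys.version_info >= (3, 9)
print("Python version OK" if ok else "Python too old; need >= 3.9")
```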
📦 Models
Download a pretrained model into the weights/ folder.
We provide the following models (scores are with online default keyframes):
| --config | Model | Abs Diff↓| Sq Rel↓ | delta < 1.05↑| Chamfer↓ | F-Score↑ |
|-------------|----------|--------------------|---------|---------|--------------|----------|
| hero_model.yaml | Metadata + Resnet Matching | 0.0868 | 0.0127 | 74.26 | 5.69 | 0.680 |
| dot_product_model.yaml | Dot Product + Resnet Matching | 0.0910 | 0.0134 | 71.90 | 5.92 | 0.667 |
hero_model is the model we report in the paper as Ours.
🚀 Speed
| --config | Model | Inference Speed (--batch_size 1) | Inference GPU memory | Approximate training time |
|------------|------------|------------|-------------------------|-----------------------------|
| hero_model | Hero, Metadata + Resnet | 130ms / 70ms (speed optimized) | 2.6GB / 5.7GB (speed optimized) | 36 hours |
| dot_product_model | Dot Product + Resnet | 80ms | 2.6GB | 36 hours |
With larger batches speed increases considerably. With batch size 8 on the non-speed optimized model, the latency drops to ~40ms.
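For context, amortized per-frame latency is typically measured by timing a batched forward pass and dividing by the number of frames. The sketch below uses a dummy stand-in for the model (everything here is illustrative, not the repo's benchmarking code; real GPU timing should also synchronize the device around the timed region):

```python
import time

def dummy_forward(batch_size):
    # Stand-in for one batched model forward pass; sleeps ~1 ms
    # regardless of batch size, mimicking how a GPU amortizes work
    # across a batch.
    time.sleep(0.001)

def amortized_latency_ms(batch_size, iters=5):
    """Time several forward passes and return milliseconds per frame."""
    start = time.perf_counter()
    for _ in range(iters):
        dummy_forward(batch_size)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / (iters * batch_size)

# Larger batches shrink the per-frame figure even though each pass
# takes similar wall time.
print(amortized_latency_ms(1) > amortized_latency_ms(8))  # True
```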
📝 TODOs:
- [x] Simple scan for folks to quickly try the code, instead of downloading the ScanNetv2 test scenes. DONE
- [x] ScanNetv2 extraction, ~~ETA 10th October~~ DONE
- [ ] FPN model weights.
- ~~[ ] Tutorial on how to use Scanniverse data, ETA 5th October 10th October 20th October~~ At present there is no publicly available way of exporting scans from Scanniverse. You'll have to use ios-logger; NeuralRecon has a good tutorial on this, and a dataloader that accepts the processed format is at datasets/arkit_dataset.py. UPDATE: There is now a quick readme at data_scripts/IOS_LOGGER_ARKIT_README.md on how to process and run inference on an ios-logger scan using the script at data_scripts/ios_logger_preprocessing.py.
🏃 Running out of the box!
We've now included two scans for people to try out immediately with the code. You can download these scans from here.
Steps:
- Download weights for the hero_model into the weights directory.
- Download the scans and unzip them to a directory of your choosing.
- Modify the value for the option dataset_path in configs/data/vdr_dense.yaml to the base path of the unzipped vdr folder.
- You should be able to run it! Something like this will work:
```shell
CUDA_VISIBLE_DEVICES=0 python test.py --name HERO_MODEL \
    --output_base_path OUTPUT_PATH \
    --config_file configs/models/hero_model.yaml \
    --load_weights_from_checkpoint weights/hero_model.ckpt \
    --data_config configs/data/vdr_dense.yaml \
    --num_workers 8 \
    --batch_size 2 \
    --fast_cost_volume \
    --run_fusion \
    --depth_fuser open3d \
    --fuse_color \
    --dump_depth_visualization;
```
This will output meshes, quick depth visualizations, and scores when benchmarked against LiDAR depth under OUTPUT_PATH.
This command uses vdr_dense.yaml which will generate depths for every frame and fuse them into a mesh. In the paper we report scores with fused keyframes instead, and you can run those using vdr_default.yaml. You can also use dense_offline tuples by instead using vdr_dense_offline.yaml.
See the section below on testing and evaluation. Make sure to use the correct config flags for datasets.
💾 ScanNetv2 Dataset
~~Please follow the instructions here to download the dataset. This dataset is quite big (>2TB), so make sure you have enough space, especially for extracting files.~~
~~Once downloaded, use this script to export raw sensor data to images and depth files.~~
We've written a quick tutorial and included modified scripts to help you with downloading and extracting ScanNetv2. You can find them at data_scripts/scannet_wrangling_scripts/
You should change the dataset_path config argument for ScanNetv2 data configs at configs/data/ to match where your dataset is.
The codebase expects ScanNetv2 to be in the following format:
```
dataset_path
    scans_test (test scans)
        scene0707
            scene0707_00_vh_clean_2.ply (gt mesh)
            sensor_data
                frame-000261.pose.txt
                frame-000261.color.jpg
                frame-000261.color.512.png (optional, image at 512x384)
                frame-000261.color.640.png (optional, image at 640x480)
                frame-000261.depth.png (full res depth, stored scale *1000)
                frame-000261.depth.256.png (optional, depth at 256x192, also scaled)
            scene0707.txt (scan metadata and image sizes)
            intrinsic
                intrinsic_depth.txt
                intrinsic_color.txt
        ...
    scans (val and train scans)
        scene0000_00
            (see above)
        scene0000_01
        ....
```
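As a worked example of the depth encoding noted above ("stored scale *1000"): the depth PNGs hold 16-bit millimetre values, so divide by 1000 to recover metres. The array below is synthetic, standing in for an actual depth PNG read, e.g. via OpenCV with cv2.IMREAD_ANYDEPTH:

```python
import numpy as np

# Synthetic 16-bit depth values in millimetres, standing in for a real
# read such as cv2.imread(path, cv2.IMREAD_ANYDEPTH).
depth_png = np.array([[1500, 0],
                      [2750, 500]], dtype=np.uint16)

depth_m = depth_png.astype(np.float32) / 1000.0  # millimetres -> metres
valid = depth_png > 0  # zeros typically mark invalid pixels

print(depth_m[0, 0])     # 1.5 (metres)
print(int(valid.sum()))  # 3 (valid pixels)
```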
In this example scene0707.txt should contain the scan's metadata:
```
colorHeight = 968
colorToDepthExtrinsics = 0.999263 -0.010031 0.037048 ........
colorWidth = 1296
depthHeight = 480
depthWidth = 640
fx_color = 1170.187988
fx_depth = 570.924255
fy_color = 1170.187988
fy_depth = 570.924316
mx_color = 647.750000
mx_depth = 319.500000
my_color = 483.750000
my_depth = 239.500000
numColorFrames = 784
numDepthFrames = 784
numIMUmeasurements = 1632
```
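For illustration, the intrinsics fields above assemble into a standard 3x3 pinhole matrix (fx/fy are focal lengths, mx/my the principal point); here using the colour-camera entries from the example scene0707.txt:

```python
import numpy as np

# Colour-camera values from the example scene0707.txt metadata.
fx_color, fy_color = 1170.187988, 1170.187988
mx_color, my_color = 647.750000, 483.750000

# Standard 3x3 pinhole intrinsics matrix.
K_color = np.array([[fx_color, 0.0,      mx_color],
                    [0.0,      fy_color, my_color],
                    [0.0,      0.0,      1.0]])
print(K_color[0, 2])  # 647.75
```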
frame-000261.pose.txt should contain pose in the form:
```
-0.384739 0.271466 -0.882203 4.98152
0.921157 0.0521417 -0.385682 1
```
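The pose file holds a 4x4 transformation matrix as whitespace-separated rows (the example above shows only the first rows). A minimal parsing sketch with synthetic values:

```python
import numpy as np

# Synthetic pose text; a real frame-XXXXXX.pose.txt holds a full
# 4x4 matrix, one row per line.
pose_text = """\
1 0 0 0.5
0 1 0 1.0
0 0 1 2.0
0 0 0 1
"""

pose = np.array([[float(v) for v in line.split()]
                 for line in pose_text.strip().splitlines()])
assert pose.shape == (4, 4)
print(pose[:3, 3])  # translation column
```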