SimpleRecon: 3D Reconstruction Without 3D Convolutions

This is the reference PyTorch implementation for training and testing MVS depth estimation models using the method described in

SimpleRecon: 3D Reconstruction Without 3D Convolutions

Mohamed Sayed, John Gibson, Jamie Watson, Victor Adrian Prisacariu, Michael Firman, and Clément Godard

Paper, ECCV 2022 (arXiv pdf), Supplemental Material, Project Page, Video

<p align="center"> <img src="media/teaser.jpeg" alt="example output" width="720" /> </p>

https://github.com/nianticlabs/simplerecon/assets/14994206/ae5074c2-6537-45f1-9f5e-0b3646a96dcb

https://user-images.githubusercontent.com/14994206/189788536-5fa8a1b5-ae8b-4f64-92d6-1ff1abb03eaf.mp4

This code is for non-commercial use; please see the license file for terms. If you do find any part of this codebase helpful, please cite our paper using the BibTeX below and link this repo. Thanks!

🆕 Updates

25/05/2023: Fixed package versions for llvm-openmp, clang, and protobuf. Do use this new environment file if you have trouble running the code and/or if dataloading is being limited to a single thread.

09/03/2023: Added kornia version to the environments file to fix kornia typing issue. (thanks @natesimon!)

26/01/2023: The license has been modified to make running the model for academic reasons easier. Please see the LICENSE file for the exact details.

There is an update as of 31/12/2022 that fixes slightly wrong intrinsics, flip augmentation for the cost volume, and a numerical precision bug in projection. All scores improve. You will need to update your forks and use new weights. See Bug Fixes.

Precomputed scans for online default frames are here: https://drive.google.com/drive/folders/1dSOFI9GayYHQjsx4I_NG0-3ebCAfWXjV?usp=share_link

🗺️ Overview

SimpleRecon takes as input posed RGB images, and outputs a depth map for a target image.
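
At its core, the method matches features between the target image and its posed source images across a sweep of depth planes. A minimal sketch of the dot-product matching variant is below; all tensors, shapes, and the depth range are hypothetical stand-ins, and the warp that aligns source features to the reference frustum (which uses the known poses and intrinsics) is replaced by random data:

```python
import numpy as np

# Hypothetical shapes: C feature channels, D depth planes, H x W resolution.
C, D, H, W = 16, 8, 4, 4
rng = np.random.default_rng(0)

ref_feats = rng.standard_normal((C, H, W))    # reference-view features
# Stand-in for source features warped into the reference frustum at each
# candidate depth plane (the real warp uses camera poses and intrinsics).
warped_src = rng.standard_normal((D, C, H, W))

# Dot-product matching cost per depth plane, as in the paper's
# dot_product_model baseline (scaled by sqrt(C) for stability).
cost_volume = np.einsum("chw,dchw->dhw", ref_feats, warped_src) / np.sqrt(C)

# Simplest possible readout: best-matching plane per pixel. The actual model
# instead feeds the cost volume to a 2D encoder-decoder to regress depth.
depth_planes = np.linspace(0.25, 5.0, D)      # hypothetical depth range
depth = depth_planes[cost_volume.argmax(axis=0)]
print(depth.shape)
```

The hero_model additionally injects per-pixel metadata (pose and geometry cues) into the cost volume before the 2D network, which is what the "Metadata" column in the tables below refers to.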

⚙️ Setup

Assuming a fresh Anaconda distribution, you can install dependencies with:

conda env create -f simplerecon_env.yml

We ran our experiments with PyTorch 1.10, CUDA 11.3, Python 3.9.7 and Debian GNU/Linux 10.
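
A quick post-install sanity check can save debugging time later. This sketch only verifies that the interpreter can resolve a few packages named elsewhere in this README (torch, kornia, open3d); it assumes nothing about the exact contents of simplerecon_env.yml:

```python
import importlib.util
import sys

# Verify the Python version and that key packages resolve in this
# environment. Package names are taken from this README (PyTorch, the
# kornia fix, and the open3d depth fuser), not from the yml file itself.
print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
missing = [pkg for pkg in ("torch", "kornia", "open3d")
           if importlib.util.find_spec(pkg) is None]
print("all packages found" if not missing else f"missing: {missing}")
```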

📦 Models

Download a pretrained model into the weights/ folder.

We provide the following models (scores are with online default keyframes):

| --config | Model | Abs Diff ↓ | Sq Rel ↓ | delta < 1.05 ↑ | Chamfer ↓ | F-Score ↑ |
|----------|-------|------------|----------|----------------|-----------|-----------|
| hero_model.yaml | Metadata + Resnet Matching | 0.0868 | 0.0127 | 74.26 | 5.69 | 0.680 |
| dot_product_model.yaml | Dot Product + Resnet Matching | 0.0910 | 0.0134 | 71.90 | 5.92 | 0.667 |

hero_model is the model we report in the paper as Ours.
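
The depth columns in the table above are standard per-pixel depth metrics. A sketch of how they are commonly computed follows; the exact masking and averaging in this repo's evaluation code may differ:

```python
import numpy as np

def depth_metrics(pred, gt):
    """Abs Diff, Sq Rel, and delta < 1.05 over valid ground-truth pixels."""
    valid = gt > 0                     # 0 marks pixels with no ground truth
    pred, gt = pred[valid], gt[valid]
    abs_diff = np.abs(pred - gt).mean()
    sq_rel = ((pred - gt) ** 2 / gt).mean()
    ratio = np.maximum(pred / gt, gt / pred)
    delta_105 = (ratio < 1.05).mean() * 100.0   # percentage of inliers
    return abs_diff, sq_rel, delta_105

# Toy 2x2 example with one invalid ground-truth pixel.
gt = np.array([[1.0, 2.0], [3.0, 0.0]])
pred = np.array([[1.02, 2.1], [2.9, 1.0]])
print(depth_metrics(pred, gt))
```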

🚀 Speed

| --config | Model | Inference Speed (--batch_size 1) | Inference GPU memory | Approximate training time |
|----------|-------|----------------------------------|----------------------|---------------------------|
| hero_model | Hero, Metadata + Resnet | 130ms / 70ms (speed optimized) | 2.6GB / 5.7GB (speed optimized) | 36 hours |
| dot_product_model | Dot Product + Resnet | 80ms | 2.6GB | 36 hours |

With larger batches speed increases considerably. With batch size 8 on the non-speed optimized model, the latency drops to ~40ms.
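
For reference, latency figures like these are usually measured with warmup iterations followed by a robust statistic over repeated runs. The sketch below uses a pure-Python stand-in workload, not the repo's benchmarking code; with a real GPU model you must also call torch.cuda.synchronize() before reading the clock, or you only time the kernel launch:

```python
import statistics
import time

def measure_latency_ms(model, warmup=3, iters=10):
    """Median wall-clock latency of model() in milliseconds."""
    for _ in range(warmup):          # warm up caches / lazy initialization
        model()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        model()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

dummy_model = lambda: sum(i * i for i in range(10_000))  # stand-in workload
print(f"{measure_latency_ms(dummy_model):.3f} ms")
```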

📝 TODOs:

  • [x] Simple scan for folks to quickly try the code, instead of downloading the ScanNetv2 test scenes. DONE
  • [x] ScanNetv2 extraction, ~~ETA 10th October~~ DONE
  • [ ] FPN model weights.
  • ~~[ ] Tutorial on how to use Scanniverse data, ETA 5th October 10th October 20th October~~ At present there is no publicly available way of exporting scans from Scanniverse. You'll have to use ios-logger; NeuralRecon have a good tutorial on this, and a dataloader that accepts the processed format is at datasets/arkit_dataset.py. UPDATE: There is now a quick readme at data_scripts/IOS_LOGGER_ARKIT_README.md for how to process and run inference on an ios-logger scan using the script at data_scripts/ios_logger_preprocessing.py.

🏃 Running out of the box!

We've now included two scans for people to try out immediately with the code. You can download these scans from here.

Steps:

  1. Download weights for the hero_model into the weights directory.
  2. Download the scans and unzip them to a directory of your choosing.
  3. Modify the value for the option dataset_path in configs/data/vdr_dense.yaml to the base path of the unzipped vdr folder.
  4. You should be able to run it! Something like this will work:
CUDA_VISIBLE_DEVICES=0 python test.py --name HERO_MODEL \
            --output_base_path OUTPUT_PATH \
            --config_file configs/models/hero_model.yaml \
            --load_weights_from_checkpoint weights/hero_model.ckpt \
            --data_config configs/data/vdr_dense.yaml \
            --num_workers 8 \
            --batch_size 2 \
            --fast_cost_volume \
            --run_fusion \
            --depth_fuser open3d \
            --fuse_color \
            --dump_depth_visualization;

This will output meshes, quick depth visualizations, and scores when benchmarked against LiDAR depth under OUTPUT_PATH.

This command uses vdr_dense.yaml which will generate depths for every frame and fuse them into a mesh. In the paper we report scores with fused keyframes instead, and you can run those using vdr_default.yaml. You can also use dense_offline tuples by instead using vdr_dense_offline.yaml.

See the section below on testing and evaluation. Make sure to use the correct config flags for datasets.

💾 ScanNetv2 Dataset

~~Please follow the instructions here to download the dataset. This dataset is quite big (>2TB), so make sure you have enough space, especially for extracting files.~~

~~Once downloaded, use this script to export raw sensor data to images and depth files.~~

We've written a quick tutorial and included modified scripts to help you with downloading and extracting ScanNetv2. You can find them at data_scripts/scannet_wrangling_scripts/.

You should change the dataset_path config argument for ScanNetv2 data configs at configs/data/ to match where your dataset is.

The codebase expects ScanNetv2 to be in the following format:

dataset_path
    scans_test (test scans)
        scene0707
            scene0707_00_vh_clean_2.ply (gt mesh)
            sensor_data
                frame-000261.pose.txt
                frame-000261.color.jpg 
                frame-000261.color.512.png (optional, image at 512x384)
                frame-000261.color.640.png (optional, image at 640x480)
                frame-000261.depth.png (full res depth, stored scale *1000)
                frame-000261.depth.256.png (optional, depth at 256x192 also
                                            scaled)
            scene0707.txt (scan metadata and image sizes)
            intrinsic
                intrinsic_depth.txt
                intrinsic_color.txt
        ...
    scans (val and train scans)
        scene0000_00
            (see above)
        scene0000_01
        ....
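
The depth PNGs in the tree above store depth in millimetres (the *1000 scale) as 16-bit images. A sketch of converting a decoded depth image back to metres; a synthetic uint16 array stands in for the output of whatever PNG decoder (e.g. PIL or OpenCV) you use:

```python
import numpy as np

# Stand-in for a decoded frame-XXXXXX.depth.png: uint16, millimetres.
depth_mm = np.array([[1500, 0], [2375, 65535]], dtype=np.uint16)

depth_m = depth_mm.astype(np.float32) / 1000.0   # millimetres -> metres
depth_m[depth_mm == 0] = np.nan                  # 0 typically marks no depth

print(depth_m)
```
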

In this example scene0707.txt should contain the scan's metadata:

    colorHeight = 968
    colorToDepthExtrinsics = 0.999263 -0.010031 0.037048 ........
    colorWidth = 1296
    depthHeight = 480
    depthWidth = 640
    fx_color = 1170.187988
    fx_depth = 570.924255
    fy_color = 1170.187988
    fy_depth = 570.924316
    mx_color = 647.750000
    mx_depth = 319.500000
    my_color = 483.750000
    my_depth = 239.500000
    numColorFrames = 784
    numDepthFrames = 784
    numIMUmeasurements = 1632
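
The metadata file is plain "key = value" text, so a few lines of Python suffice to read it. A minimal parser sketch, with an abridged copy of the file contents inlined for illustration:

```python
# Abridged scene0707.txt contents (values copied from the example above).
metadata_txt = """\
colorWidth = 1296
colorHeight = 968
fx_depth = 570.924255
numDepthFrames = 784
"""

def parse_scan_metadata(text):
    """Parse 'key = value' lines into ints/floats, leaving the rest as str."""
    meta = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        try:
            meta[key] = int(value) if value.isdigit() else float(value)
        except ValueError:
            meta[key] = value   # e.g. colorToDepthExtrinsics' row of numbers
    return meta

meta = parse_scan_metadata(metadata_txt)
print(meta["colorWidth"], meta["fx_depth"])
```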

frame-000261.pose.txt should contain pose in the form:

    -0.384739 0.271466 -0.882203 4.98152
    0.921157 0.0521417 -0.385682 1
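
The pose file holds a 4x4 matrix as rows of space-separated floats (the sample above shows only the first two rows). A loading sketch with made-up values, assuming the usual ScanNet convention that the matrix maps camera to world coordinates:

```python
import numpy as np

# Hypothetical frame-XXXXXX.pose.txt contents (identity rotation plus a
# translation, purely for illustration).
pose_txt = """\
1.0 0.0 0.0 4.98152
0.0 1.0 0.0 1.0
0.0 0.0 1.0 0.5
0.0 0.0 0.0 1.0
"""

cam_to_world = np.array([row.split() for row in pose_txt.splitlines()],
                        dtype=np.float64)
world_to_cam = np.linalg.inv(cam_to_world)   # useful for projecting points
print(cam_to_world.shape)
```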
