DoubleTake: Geometry Guided Depth Estimation

This is the reference PyTorch implementation for training and testing MVS depth estimation models using the method described in

DoubleTake: Geometry Guided Depth Estimation

Mohamed Sayed, Filippo Aleotti, Jamie Watson, Zawar Qureshi, Guillermo Garcia-Hernando, Gabriel Brostow, Sara Vicente and Michael Firman.

Paper, ECCV 2024 (arXiv pdf), Supplemental Material, Project Page, Video

<p align="center"> <img src="media/teaser.png" alt="example output" width="720" /> </p>

https://github.com/user-attachments/assets/aa2052df-79f4-43a8-ab24-d704660f228a

<p align="center"> <img src="media/mesh_teaser.png" alt="example output" width="720" /> </p>

Please refer to the license file for terms of usage. If you use this codebase in your research, please consider citing our paper using the BibTeX below and linking this repo. Thanks!


🗺️ Overview

DoubleTake takes posed RGB images as input and outputs a depth map for a target image. Under the hood, it uses a mesh that it builds itself, either online (incrementally) or offline (the mesh is built in one pass and then used for better depth in a second pass), to improve its own depth estimates.
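The two modes can be sketched roughly as follows. This is an illustrative toy, not the repo's API: `estimate_depth`, `build_mesh`, and the per-frame "depth" values are hypothetical stand-ins for the real model, TSDF fuser, and depth maps.

```python
# Toy sketch of DoubleTake's two modes. All functions are hypothetical
# stand-ins: real depths are per-pixel maps and the "mesh" is a TSDF fusion.

def estimate_depth(frame, hint=None):
    # A geometry hint from the mesh improves the raw estimate.
    return frame["raw_depth"] if hint is None else (frame["raw_depth"] + hint) / 2

def build_mesh(depths):
    # Stand-in for TSDF fusion of depth maps into geometry.
    return sum(depths) / len(depths)

def offline_two_pass(frames):
    # Pass 1: depths without hints, fused into a mesh.
    first_pass = [estimate_depth(f) for f in frames]
    mesh = build_mesh(first_pass)
    # Pass 2: the mesh is rendered back as a hint for every frame.
    return [estimate_depth(f, hint=mesh) for f in frames]

def incremental(frames):
    # Online: the mesh grows as frames arrive and hints all later frames.
    depths, mesh = [], None
    for f in frames:
        depths.append(estimate_depth(f, hint=mesh))
        mesh = build_mesh(depths)
    return depths
```

The offline mode sees the whole scan before hinting, which is why it scores better in the tables below; the incremental mode only ever hints with geometry built from frames seen so far.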

https://github.com/user-attachments/assets/269c658a-7325-4b52-98ab-bd3505f045db

⚙️ Setup

We are going to create a new Mamba environment called doubletake. If you don't have Mamba, you can install it with:

make install-mamba

Then setup the environment with:

make create-mamba-env
mamba activate doubletake

In the code directory, install the repo as a pip package:

pip install -e .

Some C++ code will be JIT-compiled with ninja the first time you use any of the fusers; this should be quick.

In case you don't have this in your ~/.bashrc already, you should run:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/

If you get a GLIBCXX_3.4.29 not found error, this missing library path is very likely the cause.

📦 Trained Models and Precomputed Meshes/Scores

We provide three models: the standard DoubleTake model used for incremental, offline, and revisit evaluation on all datasets and figures in the paper; a slimmed-down, faster version of DoubleTake; and the vanilla SimpleRecon model we used for SimpleRecon scores. Use the links in the table to access the weights for each. The scores here are very slightly different (better) than those in the paper due to a bug fix in the training data renders.

Download a pretrained model into the weights/ folder.

Scores on ScanNet:

| Model | Config | Weights | Notes |
|------------------|--------------------------------------------|---------|-------------------|
| SimpleRecon | configs/models/simplerecon_model.yaml | Link | |
| DoubleTake Small | configs/models/doubletake_small_model.yaml | Link | |
| DoubleTake | configs/models/doubletake_model.yaml | Link | ours in the paper |

| Offline/Two Pass using test_offline_two_pass | Abs Diff↓ | Sq Rel↓ | delta < 1.05↑ | Chamfer↓ | F-Score↑ | Meshes and Full Scores |
|----------------------------------------------|-----------|---------|---------------|----------|----------|------------------------|
| SimpleRecon (Offline Tuples w/ test_no_hint) | .0873 | .0128 | 74.12 | 5.29 | .668 | Link |
| DoubleTake Small | .0631 | .0097 | 86.36 | 4.64 | .723 | Link |
| DoubleTake | .0624 | .0092 | 86.64 | 4.42 | .742 | Link |

| Incremental using test_incremental | Abs Diff↓ | Sq Rel↓ | delta < 1.05↑ | Chamfer↓ | F-Score↑ | Meshes and Full Scores |
|------------------------------------|-----------|---------|---------------|----------|----------|------------------------|
| DoubleTake Small | .0825 | .0124 | 76.75 | 5.53 | .649 | Link |
| DoubleTake | .0754 | .0109 | 80.29 | 5.03 | .689 | Link |

| No hint and online using test_no_hint | Abs Diff↓ | Sq Rel↓ | delta < 1.05↑ | Chamfer↓ | F-Score↑ | Meshes and Full Scores |
|---------------------------------------|-----------|---------|---------------|----------|----------|------------------------|
| SimpleRecon (Online Tuples) | .0873 | .0128 | 74.12 | 5.29 | .668 | Link |
| DoubleTake Small | .0938 | .0148 | 72.02 | 5.50 | .650 | Link |
| DoubleTake | .0863 | .0127 | 74.64 | 5.22 | .672 | Link |
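For reference, the per-pixel depth columns in these tables follow the standard depth-evaluation definitions. Below is a minimal NumPy sketch of the first three metrics; it is illustrative, not the repo's evaluation code. Chamfer distance and F-score are mesh-level metrics computed against ground-truth geometry and are omitted here.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard per-pixel depth metrics, matching the table columns.

    pred, gt: 1-D arrays of predicted and ground-truth depths over
    valid pixels (illustrative sketch, not the repo's evaluation code).
    """
    abs_diff = np.mean(np.abs(pred - gt))       # Abs Diff (lower is better)
    sq_rel = np.mean((pred - gt) ** 2 / gt)     # Sq Rel (lower is better)
    ratio = np.maximum(pred / gt, gt / pred)
    delta_105 = np.mean(ratio < 1.05) * 100.0   # % pixels within 5% (higher is better)
    return abs_diff, sq_rel, delta_105
```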

🚀 Speed

Please see the paper and supplemental material for details on runtime. We do not include the first-pass feature caching step in this code release.

🏃 Running out of the box!

We've included two scans for people to try out immediately with the code. You can download these scans from here.

Steps:

  1. Download weights for the hero_model into the weights directory.
  2. Download the scans and unzip them into datasets/
  3. If you've unzipped into a different folder, modify the value for the option dataset_path in configs/data/vdr/vdr_default_offline.yaml to the base path of the unzipped vdr folder.
  4. You should be able to run it! Something like this will work:

For offline depth estimation and fusion:

CUDA_VISIBLE_DEVICES=0 python -m doubletake.test_offline_two_pass --name doubletake_offline \
            --output_base_path $OUTPUT_PATH \
            --config_file configs/models/doubletake_model.yaml \
            --load_weights_from_checkpoint weights/doubletake_model.ckpt \
            --data_config configs/data/vdr/vdr_default_offline.yaml \
            --num_workers 8 \
            --batch_size 2 \
            --fast_cost_volume \
            --run_fusion \
            --depth_fuser custom_open3d \
            --fuse_color \
            --fusion_max_depth 3.5 \
            --fusion_resolution 0.02 \
            --trim_tsdf_using_confidence \
            --extended_neg_truncation \
            --dump_depth_visualization;

This will output meshes, quick depth viz, and scores when benchmarked against LiDAR depth under OUTPUT_PATH.

This command uses vdr_default_offline.yaml, which generates a depth map for every keyframe and fuses them into a mesh. To get a depth map for every frame instead, use the dense offline tuples by passing vdr_dense_offline.yaml.

See the section below on testing and evaluation. Make sure to use the correct config flags for datasets.

💾 ScanNetv2 Dataset

We've written a quick tutorial and included modified scripts to help you with downloading and extracting ScanNetv2. You can find them at data_scripts/scannet_wrangling_scripts/

You should change the dataset_path config argument for ScanNetv2 data configs at configs/data/ to match where your dataset is.

The codebase expects ScanNetv2 to be in the following format:

dataset_path
    scans_test (test scans)
        scene0707
            scene0707_00_vh_clean_2.ply (gt mesh)
            sensor_data
                frame-000261.pose.txt
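As a quick sanity check on the layout above, a small standalone script like this can verify one test scene. It is illustrative and not part of the repo; it only checks for the ground-truth mesh and the sensor_data folder with pose files.

```python
from pathlib import Path

def check_scene(dataset_path, scene):
    """Check one test scene against the layout the codebase expects.

    Illustrative helper, not part of the repo. Returns a list of
    problems; an empty list means the scene looks correctly laid out.
    """
    scene_dir = Path(dataset_path) / "scans_test" / scene
    problems = []
    if not (scene_dir / f"{scene}_vh_clean_2.ply").exists():
        problems.append("missing gt mesh (*_vh_clean_2.ply)")
    sensor = scene_dir / "sensor_data"
    if not sensor.is_dir():
        problems.append("missing sensor_data/ folder")
    elif not any(sensor.glob("frame-*.pose.txt")):
        problems.append("no frame-*.pose.txt pose files")
    return problems
```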
          