MaGNet
[CVPR 2022 Oral] Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry
Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry
Official implementation of the paper
<p align="center"> <img width=100% src="https://github.com/baegwangbin/MaGNet/blob/master/figs/method.png?raw=true"> </p>

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry
CVPR 2022 [oral]
We present MaGNet (Monocular and Geometric Network), a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation. For each frame, MaGNet estimates a single-view depth probability distribution, parameterized as a pixel-wise Gaussian. The distribution estimated for the reference frame is then used to sample per-pixel depth candidates. Such probabilistic sampling enables the network to achieve higher accuracy while evaluating fewer depth candidates. We also propose depth consistency weighting for the multi-view matching score, to ensure that the multi-view depth is consistent with the single-view predictions. The proposed method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI. Qualitative evaluation demonstrates that our method is more robust against challenging artifacts such as texture-less/reflective surfaces and moving objects.
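The probabilistic sampling described above can be illustrated with a small sketch. It assumes the per-pixel prediction is a Gaussian with mean `mu` and standard deviation `sigma`, and places candidates at the probability midpoints of equal-mass bins inside a central interval. The function name `sample_depth_candidates`, the `coverage` parameter and the binning rule are illustrative assumptions, not the exact scheme used in the paper or this repository. It relies on `statistics.NormalDist`, which requires Python 3.8 or newer.

```python
from statistics import NormalDist


def sample_depth_candidates(mu, sigma, num_candidates=5, coverage=0.95):
    """Return per-pixel depth candidates drawn around a Gaussian N(mu, sigma^2).

    Hypothetical helper: candidates sit at the probability midpoints of
    `num_candidates` equal-mass bins covering the central `coverage` mass.
    """
    std_normal = NormalDist()  # zero mean, unit variance
    tail = (1.0 - coverage) / 2.0  # probability mass ignored on each side
    candidates = []
    for k in range(num_candidates):
        # Probability midpoint of the k-th equal-mass bin.
        p = tail + coverage * (k + 0.5) / num_candidates
        offset = std_normal.inv_cdf(p)  # offset in units of sigma
        candidates.append(mu + offset * sigma)
    return candidates


if __name__ == "__main__":
    # A confident pixel (small sigma) searches a narrow depth range,
    # an uncertain pixel (large sigma) searches a wider one.
    print(sample_depth_candidates(2.0, 0.1))
    print(sample_depth_candidates(2.0, 0.5))
```

Because the candidates scale with `sigma`, a confident pixel spends the same number of candidates on a much narrower depth range, which is the intuition behind evaluating fewer candidates without losing accuracy.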
Datasets
We evaluated MaGNet on ScanNet, 7-Scenes and KITTI.
ScanNet
- To download ScanNet, you must first submit an agreement to its Terms of Use. Please follow the instructions in this link.
- The folder should be organized as:
/path/to/ScanNet
/path/to/ScanNet/scans
/path/to/ScanNet/scans/scene0000_00 ...
/path/to/ScanNet/scans_test
/path/to/ScanNet/scans_test/scene0707_00 ...
7-Scenes
- Download all seven scenes (Chess, Fire, Heads, Office, Pumpkin, RedKitchen, Stairs) from this link.
- The folder should be organized as:
/path/to/SevenScenes
/path/to/SevenScenes/chess ...
KITTI
- Download raw data from this link.
- Download depth maps from this link.
- The folder should be organized as:
/path/to/KITTI
/path/to/KITTI/rawdata
/path/to/KITTI/rawdata/2011_09_26 ...
/path/to/KITTI/train
/path/to/KITTI/train/2011_09_26_drive_0001_sync ...
/path/to/KITTI/val
/path/to/KITTI/val/2011_09_26_drive_0002_sync ...
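The folder layouts above can be sanity-checked before running any script. The following is a minimal sketch that only looks for the top-level subfolders explicitly listed above. The helper `missing_subdirs` and the `EXPECTED_SUBDIRS` table are hypothetical and not part of the repository.

```python
from pathlib import Path

# Top-level subfolders named in the layouts above (illustrative data,
# not something shipped with MaGNet).
EXPECTED_SUBDIRS = {
    "ScanNet": ["scans", "scans_test"],
    "SevenScenes": ["chess"],
    "KITTI": ["rawdata", "train", "val"],
}


def missing_subdirs(dataset_root, dataset_name):
    """Return the expected subfolders that are absent under dataset_root."""
    root = Path(dataset_root)
    return [
        name
        for name in EXPECTED_SUBDIRS[dataset_name]
        if not (root / name).is_dir()
    ]


if __name__ == "__main__":
    for name in EXPECTED_SUBDIRS:
        root = Path("/path/to") / name  # replace with your own dataset roots
        print(name, "missing:", missing_subdirs(root, name))
```

An empty list means every listed subfolder was found; it does not verify the contents of the individual scenes or drives.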
Download model weights
Download the model weights by running
python ckpts/download.py
If some files are not downloaded properly, download them manually from this link and place the files under ./ckpts.
Install dependencies
We recommend using a virtual environment.
python3.6 -m venv --system-site-packages ./venv
source ./venv/bin/activate
Install the necessary dependencies by running
python3.6 -m pip install -r requirements.txt
Test scripts
If you wish to evaluate the accuracy of our D-Net (single-view), run
python test_DNet.py ./test_scripts/dnet/scannet.txt
python test_DNet.py ./test_scripts/dnet/7scenes.txt
python test_DNet.py ./test_scripts/dnet/kitti_eigen.txt
python test_DNet.py ./test_scripts/dnet/kitti_official.txt
You should get the following results:

|Dataset|abs_rel|abs_diff|sq_rel|rmse|rmse_log|irmse|log_10|silog|a1|a2|a3|NLL|
|-|-|-|-|-|-|-|-|-|-|-|-|-|
|ScanNet|0.1186|0.2070|0.0493|0.2708|0.1461|0.1086|0.0515|10.0098|0.8546|0.9703|0.9928|2.2352|
|7-Scenes|0.1339|0.2209|0.0549|0.2932|0.1677|0.1165|0.0566|12.8807|0.8308|0.9716|0.9948|2.7941|
|KITTI (eigen)|0.0605|1.1331|0.2086|2.4215|0.0921|0.0075|0.0261|8.4312|0.9602|0.9946|0.9989|2.6443|
|KITTI (official)|0.0629|1.1682|0.2541|2.4708|0.1021|0.0080|0.0270|9.5752|0.9581|0.9905|0.9971|1.7810|

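
For readers unfamiliar with the column names, the sketch below computes a subset of them (abs_rel, abs_diff, sq_rel, rmse, rmse_log, a1, a2, a3) using their commonly used definitions in the depth-estimation literature. It assumes paired lists of positive predicted and ground-truth depths; the function `depth_metrics` is hypothetical, and the repository's own evaluation code may differ in masking, depth range and averaging.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth-error metrics over paired positive depth values."""
    n = len(gt)
    pairs = list(zip(pred, gt))
    abs_diff = sum(abs(p - g) for p, g in pairs) / n
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    rmse_log = math.sqrt(
        sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n
    )
    # Threshold accuracies: fraction of pixels whose ratio error is small.
    ratios = [max(p / g, g / p) for p, g in pairs]
    a1 = sum(r < 1.25 for r in ratios) / n
    a2 = sum(r < 1.25 ** 2 for r in ratios) / n
    a3 = sum(r < 1.25 ** 3 for r in ratios) / n
    return {
        "abs_rel": abs_rel, "abs_diff": abs_diff, "sq_rel": sq_rel,
        "rmse": rmse, "rmse_log": rmse_log, "a1": a1, "a2": a2, "a3": a3,
    }


if __name__ == "__main__":
    print(depth_metrics(pred=[1.0, 2.0, 4.0], gt=[1.0, 2.0, 2.0]))
```

Lower is better for the error columns, and higher is better for a1, a2 and a3.
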
In order to evaluate the accuracy of the full pipeline (multi-view), run
python test_MaGNet.py ./test_scripts/magnet/scannet.txt
python test_MaGNet.py ./test_scripts/magnet/7scenes.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_eigen.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_official.txt
You should get the following results:

|Dataset|abs_rel|abs_diff|sq_rel|rmse|rmse_log|irmse|log_10|silog|a1|a2|a3|NLL|
|-|-|-|-|-|-|-|-|-|-|-|-|-|
|ScanNet|0.0810|0.1466|0.0302|0.2098|0.1101|0.1055|0.0351|8.7686|0.9298|0.9835|0.9946|0.1454|
|7-Scenes|0.1257|0.2133|0.0552|0.2957|0.1639|0.1782|0.0527|13.6210|0.8552|0.9715|0.9935|1.5605|
|KITTI (eigen)|0.0535|0.9995|0.1623|2.1584|0.0826|0.0566|0.0235|7.4645|0.9714|0.9958|0.9990|1.8053|
|KITTI (official)|0.0503|0.9135|0.1667|1.9707|0.0848|0.2423|0.0219|7.9451|0.9769|0.9941|0.9979|1.4750|

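
The NLL column measures how well the predicted depth distribution explains the ground truth, not just how close the mean is. Since the depth probability is parameterized as a pixel-wise Gaussian, a plausible per-pixel form is the standard Gaussian negative log-likelihood sketched below. The function `gaussian_nll` is hypothetical, and the repository's exact formula and averaging may differ.

```python
import math


def gaussian_nll(gt, mu, sigma):
    """Negative log-likelihood of a ground-truth depth under N(mu, sigma^2)."""
    return (
        0.5 * math.log(2.0 * math.pi * sigma ** 2)
        + (gt - mu) ** 2 / (2.0 * sigma ** 2)
    )


if __name__ == "__main__":
    # Accurate mean: a tighter (more confident) prediction is rewarded.
    print(gaussian_nll(gt=2.0, mu=2.0, sigma=0.1))
    print(gaussian_nll(gt=2.0, mu=2.0, sigma=0.5))
    # Wrong mean: over-confidence is penalized heavily.
    print(gaussian_nll(gt=2.0, mu=3.0, sigma=0.1))
    print(gaussian_nll(gt=2.0, mu=3.0, sigma=0.5))
```

Under this definition the value can drop when the mean becomes more accurate or when the predicted uncertainty better matches the actual error.
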
Training scripts
If you wish to train the models, run the commands below, replacing the braces with one of the listed configuration names:
python train_DNet.py ./test_scripts/dnet/{scannet, kitti_eigen, kitti_official}.txt
python train_FNet.py ./test_scripts/dnet/{scannet, kitti_eigen, kitti_official}.txt
python train_MaGNet.py ./test_scripts/dnet/{scannet, kitti_eigen, kitti_official}.txt
Note that the dataset_path argument in the script .txt files should be modified to point to your local copy of the corresponding dataset.
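Editing dataset_path by hand in every script file is error-prone, so a small helper can do it in one pass. The sketch below assumes each script file stores one option per line in the form `--dataset_path <value>`; that layout is an assumption and should be checked against the actual .txt files first. The function `set_dataset_path` is hypothetical and not part of the repository.

```python
from pathlib import Path


def set_dataset_path(config_file, new_path):
    """Rewrite lines of the assumed form '--dataset_path <value>'.

    Returns the number of lines changed; 0 means the assumed format
    was not found and the file content is unchanged.
    """
    path = Path(config_file)
    lines = path.read_text().splitlines()
    changed = 0
    for i, line in enumerate(lines):
        parts = line.split()
        if parts and parts[0] == "--dataset_path":
            lines[i] = "--dataset_path " + new_path
            changed += 1
    if changed:
        path.write_text("\n".join(lines) + "\n")
    return changed


if __name__ == "__main__":
    config = Path("./test_scripts/dnet/scannet.txt")
    if config.is_file():
        print(set_dataset_path(config, "/path/to/ScanNet"), "line(s) updated")
```

A return value of 0 is a signal to open the file and inspect how the argument is actually written.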
Citation
If you find our work useful in your research, please consider citing our paper:
@InProceedings{Bae2022,
title = {Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry},
author = {Gwangbin Bae and Ignas Budvytis and Roberto Cipolla},
booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022}
}
