# VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos (CVPR 2023)
Project Page | Paper | Supplementary | ScanNet Test Results
<br/> <center><img src="media/scene0785_00.gif" alt=""></center> <br/>

## Installation
```bash
sudo apt install libsparsehash-dev
conda env create -f environment.yaml
conda activate visfusion
```
## ScanNet Dataset
We use the same input data structure as NeuralRecon. You can download and extract the ScanNet v2 dataset by following the instructions at http://www.scan-net.org/ or by using the scannet_wrangling_scripts provided by SimpleRecon.
Expected directory structure of ScanNet:
```
DATAROOT
└───scannet
│   └───scans
│   |   └───scene0000_00
│   |       └───color
│   |       │   │   0.jpg
│   |       │   │   1.jpg
│   |       │   │   ...
│   |       │   ...
│   └───scans_test
│   |   └───scene0707_00
│   |       └───color
│   |       │   │   0.jpg
│   |       │   │   1.jpg
│   |       │   │   ...
│   |       │   ...
|   └───scannetv2_test.txt
|   └───scannetv2_train.txt
|   └───scannetv2_val.txt
```
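Before generating ground truth, it can help to sanity-check that your `DATAROOT` matches the layout above. A small convenience sketch (`check_scannet_layout` is a hypothetical helper, not part of this repository):

```python
from pathlib import Path

def check_scannet_layout(dataroot):
    """Report whether the expected ScanNet folders and split files exist.

    Only checks the top-level layout under DATAROOT/scannet; it does not
    validate individual scenes or image files.
    """
    root = Path(dataroot) / "scannet"
    required = ["scans", "scans_test", "scannetv2_train.txt",
                "scannetv2_val.txt", "scannetv2_test.txt"]
    return {name: (root / name).exists() for name in required}
```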
Then generate the input fragments and the ground truth TSDFs for the training/val data split by
```bash
python tools/tsdf_fusion/generate_gt.py --data_path PATH_TO_SCANNET \
    --save_name all_tsdf_9 \
    --window_size 9
```
and for the test split by
```bash
python tools/tsdf_fusion/generate_gt.py --test \
    --data_path PATH_TO_SCANNET \
    --save_name all_tsdf_9 \
    --window_size 9
```
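As a rough picture of what `--window_size 9` means, each fragment is a window of nine keyframes. A simplified sketch (hypothetical helper; the real `generate_gt.py` also performs pose-based keyframe selection and fuses depth into ground-truth TSDFs):

```python
def make_fragments(frame_ids, window_size=9):
    """Group keyframe ids into consecutive, non-overlapping fragments.

    Trailing frames that do not fill a complete window are dropped,
    mirroring the idea that each fragment has exactly `window_size` views.
    """
    fragments = []
    for start in range(0, len(frame_ids) - window_size + 1, window_size):
        fragments.append(frame_ids[start:start + window_size])
    return fragments
```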
## Example data
We provide an example ScanNet scene (scene0785_00) to quickly try out the code. Download it from here and unzip it into the main directory of the project code.
The reconstructed meshes will be saved to PROJECT_PATH/results.
```bash
python main.py --cfg ./config/test.yaml \
    SCENE scene0785_00 \
    TEST.PATH ./example_data/ScanNet \
    LOGDIR ./checkpoints \
    LOADCKPT pretrained/model_000049.ckpt
```
By default, it will output double-layer meshes (for NeuralRecon's evaluation). Set `MODEL.SINGLE_LAYER_MESH=True` to directly output single-layer meshes for TransformerFusion's evaluation:
```bash
python main.py --cfg ./config/test.yaml \
    SCENE scene0785_00 \
    TEST.PATH ./example_data/ScanNet \
    LOGDIR ./checkpoints \
    LOADCKPT pretrained/model_000049.ckpt \
    MODEL.SINGLE_LAYER_MESH True
```
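For intuition about the two output modes: a TSDF stores a signed distance per voxel, and surfaces lie where the values change sign. A toy 1-D sketch of locating those sign changes (hypothetical helper; the repository extracts full 3-D meshes, typically via marching cubes):

```python
def zero_crossings(tsdf_row):
    """Return indices i where a 1-D TSDF changes sign between i and i+1.

    Each crossing marks a surface location; a double-layer mesh keeps
    geometry on both sides of thin structures, while a single-layer mesh
    keeps one surface per crossing.
    """
    return [i for i in range(len(tsdf_row) - 1)
            if tsdf_row[i] * tsdf_row[i + 1] < 0]
```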
## Training
Change `TRAIN.PATH` to your own data path in `config/train.yaml` and start training by running `./train.sh`.

`train.sh`:
```bash
#!/usr/bin/env bash
export CUDA_VISIBLE_DEVICES=0
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 20 MODEL.FUSION.FUSION_ON False
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 41
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 44 TRAIN.FINETUNE_LAYER 0 MODEL.PASS_LAYERS 0
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 47 TRAIN.FINETUNE_LAYER 1 MODEL.PASS_LAYERS 1
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 50 TRAIN.FINETUNE_LAYER 2 MODEL.PASS_LAYERS 2
```
The training is separated into five phases:

- Phase 1 (epochs 1-20): train on single fragments. `MODEL.FUSION.FUSION_ON=False`
- Phase 2 (epochs 21-41): train the whole model with GRUFusion.
- Phase 3 (epochs 42-44): finetune the first layer with GRUFusion. `TRAIN.FINETUNE_LAYER=0`, `MODEL.PASS_LAYERS=0`
- Phase 4 (epochs 45-47): finetune the second layer with GRUFusion. `TRAIN.FINETUNE_LAYER=1`, `MODEL.PASS_LAYERS=1`
- Phase 5 (epochs 48-50): finetune the third layer with GRUFusion. `TRAIN.FINETUNE_LAYER=2`, `MODEL.PASS_LAYERS=2`
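The layer-wise finetuning in phases 3-5 can be pictured as selecting which coarse-to-fine level receives gradients while the others are frozen. A hypothetical sketch of that schedule (`set_trainable` is not a function in this repository):

```python
def set_trainable(layers, finetune_layer=None):
    """Mark which coarse-to-fine levels should receive gradients.

    With finetune_layer=None, all levels train jointly (phases 1-2);
    with an index set, only that level trains (phases 3-5).
    """
    return {name: (finetune_layer is None or idx == finetune_layer)
            for idx, name in enumerate(layers)}
```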
## Test

Change `TEST.PATH` to your own data path in `config/test.yaml` and start testing by running

```bash
python main.py --cfg ./config/test.yaml
```
## Evaluation

We use NeuralRecon's evaluation for our main results.

```bash
python tools/evaluation.py --model ./results/scene_scannet_checkpoints_fusion_eval_49 --n_proc 16
```

You can print previous evaluation results with

```bash
python tools/visualize_metrics.py --model ./results/scene_scannet_checkpoints_fusion_eval_49
```
Here are the 3D metrics on ScanNet generated by the provided checkpoint using NeuralRecon's evaluation:
| Acc ↓ | Comp ↓ | Chamfer ↓ | Prec ↑ | Recall ↑ | F-Score ↑ |
|-------|--------|-----------|--------|----------|-----------|
| 5.6   | 10.0   | 7.80      | 0.694  | 0.537    | 0.604     |
and using TransformerFusion's evaluation (set `MODEL.SINGLE_LAYER_MESH=True` to output single-layer meshes):
| Acc ↓ | Comp ↓ | Chamfer ↓ | Prec ↑ | Recall ↑ | F-Score ↑ |
|-------|--------|-----------|--------|----------|-----------|
| 4.10  | 8.66   | 6.38      | 0.757  | 0.588    | 0.660     |
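The precision/recall/F-score columns follow the usual point-cloud protocol: nearest-neighbor distances from predicted points to ground truth and vice versa, thresholded (commonly at 5 cm). A toy pure-Python sketch of the metric (illustration only, not the code in `tools/evaluation.py`, which samples points from meshes):

```python
import math

def f_score(pred_pts, gt_pts, threshold=0.05):
    """Precision, recall, and F-score between two 3-D point sets.

    precision: fraction of predicted points within `threshold` of ground
    truth; recall: fraction of ground-truth points within `threshold` of
    the prediction; F-score: their harmonic mean.
    """
    def nn_dist(p, pts):
        return min(math.dist(p, q) for q in pts)
    precision = sum(nn_dist(p, gt_pts) < threshold for p in pred_pts) / len(pred_pts)
    recall = sum(nn_dist(g, pred_pts) < threshold for g in gt_pts) / len(gt_pts)
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```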
## ARKit data

To try your own data captured with ARKit, please refer to NeuralRecon's DEMO.md for more details.

```bash
python test_scene.py --cfg ./config/test_scene.yaml \
    DATASET ARKit \
    TEST.PATH ./example_data/ARKit_scan \
    LOADCKPT pretrained/model_000049.ckpt
```
## Citation

If you find our work useful in your research, please consider citing our paper:

```bibtex
@inproceedings{gao2023visfusion,
  title={VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos},
  author={Gao, Huiyu and Mao, Wei and Liu, Miaomiao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={17317--17326},
  year={2023}
}
```
## Acknowledgment
This repository is partly based on the repo NeuralRecon. Many thanks to Jiaming Sun for the great code!
