VisFusion

[CVPR 2023] Code for "VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos"

VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos (CVPR 2023)

Project Page | Paper | Supplementary | ScanNet Test Results

<br/> <center><img src="media/scene0785_00.gif" alt=""></center> <br/>

Installation

sudo apt install libsparsehash-dev
conda env create -f environment.yaml
conda activate visfusion

ScanNet Dataset

We use the same input data structure as NeuralRecon. You can download and extract the ScanNet v2 dataset by following the instructions at http://www.scan-net.org/ or by using the scannet_wrangling_scripts provided by SimpleRecon.

Expected directory structure of ScanNet:

DATAROOT
└───scannet
    ├───scans
    │   └───scene0000_00
    │       ├───color
    │       │   ├───0.jpg
    │       │   ├───1.jpg
    │       │   └───...
    │       └───...
    ├───scans_test
    │   └───scene0707_00
    │       ├───color
    │       │   ├───0.jpg
    │       │   ├───1.jpg
    │       │   └───...
    │       └───...
    ├───scannetv2_test.txt
    ├───scannetv2_train.txt
    └───scannetv2_val.txt
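
Before generating fragments, it can help to confirm that the dataset root matches this layout. The following is a minimal sketch and not part of the repository: the helper name `check_scannet_layout` and the set of checks are assumptions based only on the directory tree above.

```python
from pathlib import Path

# Split files listed in the tree above, expected directly under DATAROOT/scannet.
SPLIT_FILES = ("scannetv2_train.txt", "scannetv2_val.txt", "scannetv2_test.txt")


def check_scannet_layout(dataroot):
    """Return a list of problems; an empty list means the layout looks right.

    Hypothetical helper: it only checks what the directory tree above shows.
    """
    root = Path(dataroot) / "scannet"
    problems = []

    for name in SPLIT_FILES:
        if not (root / name).is_file():
            problems.append(f"missing split file: {root / name}")

    for split_dir in ("scans", "scans_test"):
        base = root / split_dir
        if not base.is_dir():
            problems.append(f"missing directory: {base}")
            continue
        for scene in sorted(p for p in base.iterdir() if p.is_dir()):
            color = scene / "color"
            # Each scene is expected to hold a color/ folder with .jpg frames.
            if not color.is_dir() or not any(color.glob("*.jpg")):
                problems.append(f"no color frames in: {scene}")

    return problems
```

Calling `check_scannet_layout("DATAROOT")` and printing the returned list gives a quick overview of anything that is missing before running the generation scripts.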

Then generate the input fragments and the ground truth TSDFs for the training/val data split by running

python tools/tsdf_fusion/generate_gt.py --data_path PATH_TO_SCANNET \
                                        --save_name all_tsdf_9 \
                                        --window_size 9

and for the test split by

python tools/tsdf_fusion/generate_gt.py --test \
                                        --data_path PATH_TO_SCANNET \
                                        --save_name all_tsdf_9 \
                                        --window_size 9

Example data

We provide an example ScanNet scene (scene0785_00) to quickly try out the code. Download it from here and unzip it into the main directory of the project code.

The reconstructed meshes will be saved to PROJECT_PATH/results.

python main.py --cfg ./config/test.yaml \
                SCENE scene0785_00 \
                TEST.PATH ./example_data/ScanNet \
                LOGDIR ./checkpoints \
                LOADCKPT pretrained/model_000049.ckpt

By default, the code outputs double-layer meshes (for NeuralRecon's evaluation). Set MODEL.SINGLE_LAYER_MESH=True to directly output single-layer meshes for TransformerFusion's evaluation.

python main.py --cfg ./config/test.yaml \
                SCENE scene0785_00 \
                TEST.PATH ./example_data/ScanNet \
                LOGDIR ./checkpoints \
                LOADCKPT pretrained/model_000049.ckpt \
                MODEL.SINGLE_LAYER_MESH True
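
The trailing arguments in these commands are KEY VALUE pairs that override entries of the YAML config, with dots selecting nested sections. The sketch below illustrates that convention in plain Python. It is an assumption about the behavior, not the repository's actual parsing code, and `apply_overrides` and the `defaults` dictionary are hypothetical.

```python
import ast
import copy


def apply_overrides(config, opts):
    """Return a copy of `config` with KEY VALUE pairs from `opts` applied.

    Hypothetical helper: dotted keys such as "MODEL.SINGLE_LAYER_MESH" select
    nested dictionaries, and unknown keys are rejected instead of created.
    """
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")

    result = copy.deepcopy(config)
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            if not isinstance(node.get(name), dict):
                raise KeyError(f"unknown config section: {key}")
            node = node[name]
        if leaf not in node:
            raise KeyError(f"unknown config key: {key}")
        try:
            # Turn "True", "20", "0.5" into Python values; keep paths as strings.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw
        node[leaf] = value
    return result


# Placeholder defaults for illustration only; the real ones live in the YAML config.
defaults = {
    "SCENE": "",
    "TEST": {"PATH": ""},
    "MODEL": {"SINGLE_LAYER_MESH": False},
}

cfg = apply_overrides(defaults, [
    "SCENE", "scene0785_00",
    "TEST.PATH", "./example_data/ScanNet",
    "MODEL.SINGLE_LAYER_MESH", "True",
])
```

Rejecting unknown keys is a deliberate choice in this sketch: a misspelled key then fails loudly instead of silently leaving the default in place.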

Training

Change TRAIN.PATH to your own data path in config/train.yaml and start training by running ./train.sh.

train.sh:

#!/usr/bin/env bash
export CUDA_VISIBLE_DEVICES=0

python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 20 MODEL.FUSION.FUSION_ON False
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 41
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 44 TRAIN.FINETUNE_LAYER 0 MODEL.PASS_LAYERS 0
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 47 TRAIN.FINETUNE_LAYER 1 MODEL.PASS_LAYERS 1
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 50 TRAIN.FINETUNE_LAYER 2 MODEL.PASS_LAYERS 2

Training is separated into five phases:

  • Phase 1 (epochs 1 - 20): train on single fragments. MODEL.FUSION.FUSION_ON=False

  • Phase 2 (epochs 21 - 41): train the whole model with GRUFusion.

  • Phase 3 (epochs 42 - 44): finetune the first layer with GRUFusion. TRAIN.FINETUNE_LAYER=0, MODEL.PASS_LAYERS=0

  • Phase 4 (epochs 45 - 47): finetune the second layer with GRUFusion. TRAIN.FINETUNE_LAYER=1, MODEL.PASS_LAYERS=1

  • Phase 5 (epochs 48 - 50): finetune the third layer with GRUFusion. TRAIN.FINETUNE_LAYER=2, MODEL.PASS_LAYERS=2
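
The five phases above can be written as a small table that maps each phase to its final epoch and its config overrides. This is only an illustrative sketch: `PHASES`, `phase_for_epoch`, and `build_command` are hypothetical names, and the values are copied from train.sh.

```python
# (last epoch of the phase, extra KEY VALUE overrides), copied from train.sh.
PHASES = [
    (20, ["MODEL.FUSION.FUSION_ON", "False"]),
    (41, []),
    (44, ["TRAIN.FINETUNE_LAYER", "0", "MODEL.PASS_LAYERS", "0"]),
    (47, ["TRAIN.FINETUNE_LAYER", "1", "MODEL.PASS_LAYERS", "1"]),
    (50, ["TRAIN.FINETUNE_LAYER", "2", "MODEL.PASS_LAYERS", "2"]),
]


def phase_for_epoch(epoch):
    """Return the 1-based phase number that a 1-based epoch falls into."""
    if epoch < 1:
        raise ValueError("epochs are counted from 1")
    for number, (last_epoch, _) in enumerate(PHASES, start=1):
        if epoch <= last_epoch:
            return number
    raise ValueError(f"epoch {epoch} is past the final phase")


def build_command(phase, cfg="./config/train.yaml"):
    """Return the argument list train.sh uses for the given 1-based phase."""
    last_epoch, overrides = PHASES[phase - 1]
    return ["python", "main.py", "--cfg", cfg,
            "TRAIN.EPOCHS", str(last_epoch), *overrides]


for phase in range(1, len(PHASES) + 1):
    print(" ".join(build_command(phase)))
```

Note that TRAIN.EPOCHS in train.sh is the cumulative end epoch of each phase (20, 41, 44, 47, 50), not the number of epochs spent in that phase.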

Test

Change TEST.PATH to your own data path in config/test.yaml and start testing by running

python main.py --cfg ./config/test.yaml

Evaluation

We use NeuralRecon's evaluation for our main results.

python tools/evaluation.py --model ./results/scene_scannet_checkpoints_fusion_eval_49 --n_proc 16

You can print previous evaluation results by running

python tools/visualize_metrics.py --model ./results/scene_scannet_checkpoints_fusion_eval_49

Here are the 3D metrics on ScanNet generated by the provided checkpoint using NeuralRecon's evaluation:

| Acc ↓ | Comp ↓ | Chamfer ↓ | Prec ↑ | Recall ↑ | F-Score ↑ |
|----------|----------|----------|----------|----------|----------|
| 5.6 | 10.0 | 7.80 | 0.694 | 0.537 | 0.604 |

And here are the metrics using TransformerFusion's evaluation (set MODEL.SINGLE_LAYER_MESH=True to output single-layer meshes):

| Acc ↓ | Comp ↓ | Chamfer ↓ | Prec ↑ | Recall ↑ | F-Score ↑ |
|----------|----------|----------|----------|----------|----------|
| 4.10 | 8.66 | 6.38 | 0.757 | 0.588 | 0.660 |
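
For readers unfamiliar with these column names, the sketch below shows how such metrics are commonly defined from nearest-neighbor distances between predicted and ground-truth points. The definitions and the default threshold are assumptions based on common practice, not taken from this repository; tools/evaluation.py remains the authoritative implementation, and `summarize_distances` is a hypothetical name.

```python
def summarize_distances(pred_to_gt, gt_to_pred, threshold=0.05):
    """Compute mesh-quality metrics from precomputed nearest-neighbor distances.

    pred_to_gt: distance from each predicted point to its closest ground-truth point.
    gt_to_pred: distance from each ground-truth point to its closest predicted point.
    threshold:  distance under which a point counts as correct (assumed value).
    """
    if not pred_to_gt or not gt_to_pred:
        raise ValueError("both distance lists must be non-empty")

    acc = sum(pred_to_gt) / len(pred_to_gt)    # lower is better
    comp = sum(gt_to_pred) / len(gt_to_pred)   # lower is better
    prec = sum(d < threshold for d in pred_to_gt) / len(pred_to_gt)
    recall = sum(d < threshold for d in gt_to_pred) / len(gt_to_pred)
    denom = prec + recall
    fscore = 2 * prec * recall / denom if denom > 0 else 0.0

    return {
        "acc": acc,
        "comp": comp,
        "chamfer": (acc + comp) / 2,  # assumed: mean of accuracy and completeness
        "prec": prec,
        "recall": recall,
        "fscore": fscore,
    }
```

The reported numbers are consistent with this reading of Chamfer: (5.6 + 10.0) / 2 = 7.80 and (4.10 + 8.66) / 2 = 6.38.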

ARKit data

To try with your own data captured from ARKit, please refer to NeuralRecon's DEMO.md for more details.

python test_scene.py --cfg ./config/test_scene.yaml \
                     DATASET ARKit \
                     TEST.PATH ./example_data/ARKit_scan \
                     LOADCKPT pretrained/model_000049.ckpt

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{gao2023visfusion,
  title={VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos},
  author={Gao, Huiyu and Mao, Wei and Liu, Miaomiao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={17317--17326},
  year={2023}
}

Acknowledgment

This repository is partly based on the repo NeuralRecon. Many thanks to Jiaming Sun for the great code!
