VisFusion

[CVPR 2023] Code for "VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos"


VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos (CVPR 2023)

Project Page | Paper | Supplementary | ScanNet Test Results

Installation

sudo apt install libsparsehash-dev
conda env create -f environment.yaml
conda activate visfusion

ScanNet Dataset

We use the same input data structure as NeuralRecon. You can download and extract the ScanNet v2 dataset by following the instructions provided at http://www.scan-net.org/ or by using the scannet_wrangling_scripts provided by SimpleRecon.

Expected directory structure of ScanNet:

DATAROOT
└───scannet
│   └───scans
│   │   └───scene0000_00
│   │       └───color
│   │       │   │   0.jpg
│   │       │   │   1.jpg
│   │       │   │   ...
│   │       │   ...
│   └───scans_test
│   │   └───scene0707_00
│   │       └───color
│   │       │   │   0.jpg
│   │       │   │   1.jpg
│   │       │   │   ...
│   │       │   ...
│   └───scannetv2_test.txt
│   └───scannetv2_train.txt
│   └───scannetv2_val.txt
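
Before generating fragments, it can help to confirm that the data matches the layout above. The following is a minimal sketch, not part of the repository: `check_scannet_layout` is a hypothetical helper, and it only checks the items that the tree explicitly names (the three split files, the `scans` and `scans_test` folders, and a `color` folder per scene). Anything elided with `...` in the tree is not checked.

```python
from pathlib import Path


def check_scannet_layout(data_root):
    """Return a list of problems found under data_root/scannet (empty if none)."""
    root = Path(data_root) / "scannet"
    problems = []
    # Split files named in the directory tree.
    for name in ("scannetv2_train.txt", "scannetv2_val.txt", "scannetv2_test.txt"):
        if not (root / name).is_file():
            problems.append(f"missing split file: {name}")
    # Every scene folder is expected to contain a 'color' directory.
    for split_dir in ("scans", "scans_test"):
        base = root / split_dir
        if not base.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        for scene in sorted(p for p in base.iterdir() if p.is_dir()):
            if not (scene / "color").is_dir():
                problems.append(f"{split_dir}/{scene.name}: missing color/")
    return problems
```

Calling `check_scannet_layout("DATAROOT")` returns an empty list when every checked item is present, so the result can be printed or asserted on before running the slower preprocessing step.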

Then generate the input fragments and the ground truth TSDFs for the training/val data split by

python tools/tsdf_fusion/generate_gt.py --data_path PATH_TO_SCANNET \
                                        --save_name all_tsdf_9 \
                                        --window_size 9

and for the test split by

python tools/tsdf_fusion/generate_gt.py --test \
                                        --data_path PATH_TO_SCANNET \
                                        --save_name all_tsdf_9 \
                                        --window_size 9

Example data

We provide an example ScanNet scene (scene0785_00) to quickly try out the code. Download it from here and unzip it into the main directory of the project code.

The reconstructed meshes will be saved to PROJECT_PATH/results.

python main.py --cfg ./config/test.yaml \
                SCENE scene0785_00 \
                TEST.PATH ./example_data/ScanNet \
                LOGDIR ./checkpoints \
                LOADCKPT pretrained/model_000049.ckpt

By default, the code outputs double-layer meshes (for NeuralRecon's evaluation). Set MODEL.SINGLE_LAYER_MESH=True to output single-layer meshes directly for TransformerFusion's evaluation.

python main.py --cfg ./config/test.yaml \
                SCENE scene0785_00 \
                TEST.PATH ./example_data/ScanNet \
                LOGDIR ./checkpoints \
                LOADCKPT pretrained/model_000049.ckpt \
                MODEL.SINGLE_LAYER_MESH True
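
The commands above pass configuration overrides as alternating KEY VALUE tokens after `--cfg`. When launching runs from a script, the same argument list can be assembled in Python, which avoids shell line-continuation pitfalls. This is a minimal sketch: `build_command` is a hypothetical helper, not something the repository provides, and the keys are simply the ones used in the example above.

```python
import shlex


def build_command(cfg_path, overrides):
    """Build the argument list for main.py from a dict of config overrides."""
    cmd = ["python", "main.py", "--cfg", cfg_path]
    for key, value in overrides.items():
        # Overrides follow --cfg as alternating KEY VALUE tokens.
        cmd += [key, str(value)]
    return cmd


cmd = build_command(
    "./config/test.yaml",
    {
        "SCENE": "scene0785_00",
        "TEST.PATH": "./example_data/ScanNet",
        "LOGDIR": "./checkpoints",
        "LOADCKPT": "pretrained/model_000049.ckpt",
        "MODEL.SINGLE_LAYER_MESH": True,
    },
)
# Print the equivalent single-line shell command for inspection.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the project root to start the run.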

Training

Change TRAIN.PATH to your own data path in config/train.yaml and start training by running ./train.sh.

train.sh:

#!/usr/bin/env bash
export CUDA_VISIBLE_DEVICES=0

python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 20 MODEL.FUSION.FUSION_ON False
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 41
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 44 TRAIN.FINETUNE_LAYER 0 MODEL.PASS_LAYERS 0
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 47 TRAIN.FINETUNE_LAYER 1 MODEL.PASS_LAYERS 1
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 50 TRAIN.FINETUNE_LAYER 2 MODEL.PASS_LAYERS 2

Training is separated into five phases:

Phase 1 (epoch 1 - 20), train single fragments. MODEL.FUSION.FUSION_ON=False
Phase 2 (epoch 21 - 41), train the whole model with GRUFusion.
Phase 3 (epoch 42 - 44), finetune the first layer with GRUFusion. TRAIN.FINETUNE_LAYER=0, MODEL.PASS_LAYERS=0
Phase 4 (epoch 45 - 47), finetune the second layer with GRUFusion. TRAIN.FINETUNE_LAYER=1, MODEL.PASS_LAYERS=1
Phase 5 (epoch 48 - 50), finetune the third layer with GRUFusion. TRAIN.FINETUNE_LAYER=2, MODEL.PASS_LAYERS=2
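
The phase schedule above can be written down as a small lookup table, which is handy when checking which overrides a given checkpoint was trained with. This is an illustrative sketch only: `PHASES` and `phase_for_epoch` are hypothetical names, and the epoch boundaries and override values are copied from train.sh and the list above, with epochs counted from 1.

```python
# (last_epoch, overrides) for each phase, copied from train.sh.
PHASES = [
    (20, {"MODEL.FUSION.FUSION_ON": False}),
    (41, {}),
    (44, {"TRAIN.FINETUNE_LAYER": 0, "MODEL.PASS_LAYERS": 0}),
    (47, {"TRAIN.FINETUNE_LAYER": 1, "MODEL.PASS_LAYERS": 1}),
    (50, {"TRAIN.FINETUNE_LAYER": 2, "MODEL.PASS_LAYERS": 2}),
]


def phase_for_epoch(epoch):
    """Return (phase_number, overrides) for a 1-based epoch index."""
    if epoch < 1:
        raise ValueError("epoch must be >= 1")
    for number, (last_epoch, overrides) in enumerate(PHASES, start=1):
        if epoch <= last_epoch:
            return number, overrides
    raise ValueError(f"epoch {epoch} is beyond the last phase")
```

For example, `phase_for_epoch(43)` returns phase 3 together with the two overrides that restrict finetuning to the first layer.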

Test

Change TEST.PATH to your own data path in config/test.yaml and start testing by running

python main.py --cfg ./config/test.yaml

Evaluation

We use NeuralRecon's evaluation for our main results.

python tools/evaluation.py --model ./results/scene_scannet_checkpoints_fusion_eval_49 --n_proc 16

You can print previous evaluation results by running

python tools/visualize_metrics.py --model ./results/scene_scannet_checkpoints_fusion_eval_49

Here are the 3D metrics on ScanNet generated by the provided checkpoint using NeuralRecon's evaluation:

| Acc ↓ | Comp ↓ | Chamfer ↓ | Prec ↑ | Recall ↑ | F-Score ↑ |
|----------|----------|----------|----------|----------|----------|
| 5.6 | 10.0 | 7.80 | 0.694 | 0.537 | 0.604 |

and using TransformerFusion's evaluation (set MODEL.SINGLE_LAYER_MESH=True to output single layer meshes):

| Acc ↓ | Comp ↓ | Chamfer ↓ | Prec ↑ | Recall ↑ | F-Score ↑ |
|----------|----------|----------|----------|----------|----------|
| 4.10 | 8.66 | 6.38 | 0.757 | 0.588 | 0.660 |
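
For readers unfamiliar with the column names in the tables above, the sketch below shows how such metrics are commonly defined on point sets sampled from the predicted and ground-truth meshes. It is not the evaluation code from this repository or from NeuralRecon or TransformerFusion: the function names are hypothetical, the distance threshold is an assumed value, and the brute-force nearest-neighbour search is only suitable for tiny inputs. Taking Chamfer as the mean of Acc and Comp does agree with the reported numbers, for example (5.6 + 10.0) / 2 = 7.80.

```python
import math


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def mesh_metrics(pred, gt, threshold=0.05):
    """Common point-set metrics; threshold is an assumed value in input units."""
    d_pred = nearest_distances(pred, gt)  # prediction -> ground truth
    d_gt = nearest_distances(gt, pred)    # ground truth -> prediction
    acc = sum(d_pred) / len(d_pred)       # lower is better
    comp = sum(d_gt) / len(d_gt)          # lower is better
    prec = sum(d < threshold for d in d_pred) / len(d_pred)
    recall = sum(d < threshold for d in d_gt) / len(d_gt)
    denom = prec + recall
    fscore = 2 * prec * recall / denom if denom > 0 else 0.0
    return {
        "acc": acc,
        "comp": comp,
        "chamfer": (acc + comp) / 2,
        "prec": prec,
        "recall": recall,
        "fscore": fscore,
    }
```

Distances are returned in the same units as the input coordinates, so any conversion to the units used in the tables has to be applied separately.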

ARKit data

To try with your own data captured from ARKit, please refer to NeuralRecon's DEMO.md for more details.

python test_scene.py --cfg ./config/test_scene.yaml \
                     DATASET ARKit \
                     TEST.PATH ./example_data/ARKit_scan \
                     LOADCKPT pretrained/model_000049.ckpt

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{gao2023visfusion,
  title={VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos},
  author={Gao, Huiyu and Mao, Wei and Liu, Miaomiao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={17317--17326},
  year={2023}
}

Acknowledgment

This repository is partly based on the repo NeuralRecon. Many thanks to Jiaming Sun for the great code!
