ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation

<div align="center">

Website · Paper · Hugging Face Space

</div>

(Teaser figure)

Alpha Demo: https://huggingface.co/spaces/Stable-X/ReconViaGen. We welcome feedback on failure cases to help improve the model.


🆕 News — We've fixed some bugs in the v0.5 branch; please resynchronize with the remote repository.

🆕 News (v0.5) — The inference code of ReconViaGen-v0.5 is released in the v0.5 branch of this repository! Thanks to the excellent work TRELLIS.2! We propose an effective multi-view fusion strategy for TRELLIS.2 and combine it with ReconViaGen to generate high-resolution meshes and PBR materials. For details, please refer to the v0.5 branch of this repository.

<div align="center">

Demo of ReconViaGen-v0.5

</div>

News (v0.2) — Releasing the training and inference code of ReconViaGen-v0.2 in the main branch of this repository! We have optimized the inference process. Reconstructing 16 images using ReconViaGen without refinement (app.py) consumes less than 18GB of VRAM. Reconstructing 16 images using ReconViaGen (app_fine.py) consumes less than 24GB of VRAM.

News (Community) — An unofficial implementation of ReconViaGen has been released! Many thanks to estheryang11!

Installation

Clone the repo:

git clone --recursive https://github.com/GAP-LAB-CUHK-SZ/ReconViaGen.git
cd ReconViaGen

Create a new conda environment named reconviagen and install the dependencies (PyTorch 2.4.0 with CUDA 12.1):

. ./setup.sh --new-env --basic --xformers --flash-attn --spconv --mipgaussian --kaolin --nvdiffrast --demo

Local Demo 🤗

Run the script to reconstruct the object without refinement by:

python app.py

Run the script to reconstruct the object with refinement by:

python app_fine.py

Training

0. Data Preparation

The processed dataset can be downloaded here. The dataset is organized as follows:

ProObjaverse-300K/
├── renders_random_env/
│   ├── shard-0000/
│   │   ├── {uid}.tar          # per-object archive
│   │   │   ├── {uid}/000.json          # camera metadata (extrinsic 4×4, intrinsic 3×3)
│   │   │   ├── {uid}/000.rgba.webp     # RGBA render, 1024×1024
│   │   │   ├── {uid}/001.json
│   │   │   ├── {uid}/001.rgba.webp
│   │   │   └── ...                     # up to ~80 views per object
│   │   └── ...
│   ├── shard-0001/
│   └── ...
└── lh-slats/
    ├── shard-0000/
    │   ├── {uid}.npz          # structured latent for the object
    │   │   ├── feats:  float32 (N, 8)       # latent features per voxel
    │   │   └── coords: uint8   (N, 3)        # voxel coordinates in [0, 63]
    │   └── ...
    ├── shard-0001/
    └── ...

Each .tar archive contains all rendered views for one object. The uid is shared between the render tar and the slat npz and is used to pair them at training time. Each view's .json file stores that view's camera pose (4×4 extrinsic and 3×3 intrinsic matrices).
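The pairing described above can be sketched in Python. This is a minimal, hypothetical loader (the function names and shard paths are illustrative, not part of the repository); only the on-disk layout is taken from the tree above:

```python
import json
import tarfile
from pathlib import Path

import numpy as np


def load_views_from_tar(tar_path):
    """Read all views from one per-object archive.

    Returns a dict mapping view index ("000", "001", ...) to a dict with
    "camera" (parsed JSON: 4x4 extrinsic, 3x3 intrinsic) and "rgba"
    (raw WebP bytes, to be decoded later with e.g. PIL or imageio).
    """
    views = {}
    with tarfile.open(tar_path) as tf:
        for member in tf.getmembers():
            if not member.isfile():
                continue
            name = Path(member.name)           # e.g. "{uid}/000.rgba.webp"
            idx = name.stem.split(".")[0]      # "000" from "000.json" / "000.rgba"
            entry = views.setdefault(idx, {})
            data = tf.extractfile(member).read()
            if name.suffix == ".json":
                entry["camera"] = json.loads(data)
            elif name.suffix == ".webp":
                entry["rgba"] = data
    return views


def load_slat(npz_path):
    """Load the structured latent: per-voxel features (N, 8) and coords (N, 3) in [0, 63]."""
    with np.load(npz_path) as z:
        return z["feats"].astype(np.float32), z["coords"].astype(np.uint8)


def pair_object(renders_shard, slat_shard, uid):
    """Pair one object's rendered views with its structured latent via the shared uid."""
    views = load_views_from_tar(Path(renders_shard) / f"{uid}.tar")
    feats, coords = load_slat(Path(slat_shard) / f"{uid}.npz")
    return views, feats, coords
```

A training dataset would typically enumerate uids from one shard directory and call `pair_object` per sample; decoding the WebP bytes and assembling camera matrices into tensors is left to the training pipeline.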

1. Training DiT of SS Stage.

Run the following code to train the flow model of SS Stage on the ProObjaverse-300K dataset:

. ./train_ss.sh

Note that we trained the model on 8 A100 (80GB) GPUs.

2. Training DiT of SLat Stage.

Run the following code to train the flow model of SLat Stage on the ProObjaverse-300K dataset:

. ./train_slat.sh

Note that we trained the model on 8 A100 (80GB) GPUs.

3. Try the checkpoint with gradio:

Run the following code to try your trained checkpoints with gradio:

python app_try.py --ss_ckpt /path_to_your_trained_ss_ckpt --slat_ckpt /path_to_your_trained_slat_ckpt

Citation

@article{chang2025reconviagen,
        title={ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation},
        author={Chang, Jiahao and Ye, Chongjie and Wu, Yushuang and Chen, Yuantao and Zhang, Yidan and Luo, Zhongjin and Li, Chenghong and Zhi, Yihao and Han, Xiaoguang},
        journal={arXiv preprint arXiv:2510.23306},
        year={2025}
}
