<h1 align="left"> <img src="assets/InfiniteVGGT_Logo.jpg" alt="Logo" height="40px" style="vertical-align: middle;"> InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams </h1>
<div align="center">
<a> <img src="assets/autolab_logo.png" alt="Autolab Logo" height="50" align="middle"> </a>
<a href="https://github.com/Henryyuan429">Shuai Yuan,</a>1
<a href="https://github.com/YantaiYang-05">Yantai Yang,</a>1, 2
<a>Xiaotian Yang,</a>1
<a>Xupeng Zhang,</a>1
<a>Zhonghao Zhao,</a>1
<a>Lingming Zhang,</a>
<a href="https://zhipengzhang.cn/">Zhipeng Zhang</a>1 ✉
1<a>AutoLab, School of Artificial Intelligence, Shanghai Jiao Tong University</a>
2<a>Anyverse Dynamics</a>
✉ Corresponding Author
</div>
<a href="https://arxiv.org/abs/2601.02281v1"><img src="https://img.shields.io/badge/arXiv-InfiniteVGGT-red?logo=arxiv" alt="Paper PDF"></a> <a href="https://huggingface.co/papers/2601.02281"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging_Face-InfiniteVGGT-yellow" alt="Hugging Face"></a>
<img src="assets/InfiniteVGGT.gif" width="70%">

InfiniteVGGT achieves higher reconstruction quality and more accurate camera pose estimation from inputs of thousands of frames.

📰 News

[Jan 6, 2026] Paper release.
[Jan 6, 2026] Code release.
[Jan 19, 2026] Long3D dataset release.

🔍 Recommendation

Please also check out our previous collaborative work, FastVGGT.

📖 Overview

We propose InfiniteVGGT, a causal visual geometry transformer that uses a training-free rolling memory mechanism to enable stable, infinite-horizon streaming. We also introduce the Long3D benchmark to rigorously evaluate long-term, continuous 3D geometry performance. Our main contributions are summarized as follows:

An unbounded memory architecture, InfiniteVGGT, for continuous 3D geometry understanding, built on a novel, dynamic, and interpretable explicit memory system.
State-of-the-art performance on long-sequence benchmarks and a unique capability for robust, infinite-horizon reconstruction without memory overflow.
The Long3D benchmark, a new dataset for the rigorous evaluation of long-term performance, addressing a critical gap in the field.
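
To make the idea of a rolling memory concrete, here is a toy sketch of a fixed-budget buffer that can absorb an arbitrarily long stream without growing. It is not the InfiniteVGGT implementation: the `RollingMemory` class, the `budget` parameter, and the eviction rule (keep the first entry as an anchor, drop the oldest of the rest) are all assumptions made for illustration only.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, List


@dataclass
class RollingMemory:
    """Toy fixed-budget memory (hypothetical, for illustration only)."""

    budget: int  # assumed cap on the number of stored entries
    anchor: Any = None  # first entry, kept permanently in this sketch
    recent: Deque[Any] = field(default_factory=deque)

    def add(self, entry: Any) -> None:
        # The first entry becomes the anchor and is never evicted.
        if self.anchor is None:
            self.anchor = entry
            return
        self.recent.append(entry)
        # Evict the oldest non-anchor entries once the budget is exceeded.
        while len(self.recent) > self.budget - 1:
            self.recent.popleft()

    def contents(self) -> List[Any]:
        head = [] if self.anchor is None else [self.anchor]
        return head + list(self.recent)


memory = RollingMemory(budget=4)
for frame_id in range(1000):  # stand-in for a long frame stream
    memory.add(frame_id)

print(memory.contents())  # [0, 997, 998, 999]
```

The point of the sketch is only that memory use stays constant however many frames arrive; the actual selection policy used by InfiniteVGGT is described in the paper.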

🌍 Installation

Clone InfiniteVGGT

git clone https://github.com/AutoLab-SAI-SJTU/InfiniteVGGT.git
cd InfiniteVGGT

Create conda environment

conda create -n infinitevggt python=3.11 cmake=3.14.0
conda activate infinitevggt

Install requirements

pip install -r requirements.txt
conda install 'llvm-openmp<16'

Download the StreamVGGT pretrained checkpoint and place it in the ./ckpt directory.

▶️ Run Inference

# Run on your own data
python run_inference.py --input_dir path/to/your/images_dir

# Run a long sequence and store each frame's result in a directory
python run_inference.py \
    --input_dir path/to/your/images_dir \
    --frame_cache_dir path/to/your/results_perframe_dir \
    --no_cache_results
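
If you want to launch the commands above from a script, for example to process several sequences in a row, a small wrapper along these lines may help. This is a sketch: `build_inference_command` is a hypothetical helper, not part of the repository, and it uses only the flags shown above.

```python
import sys
from typing import List, Optional


def build_inference_command(
    input_dir: str, frame_cache_dir: Optional[str] = None
) -> List[str]:
    """Assemble the argument list for run_inference.py (hypothetical helper)."""
    cmd = [sys.executable, "run_inference.py", "--input_dir", input_dir]
    if frame_cache_dir is not None:
        # Long-sequence mode: store the result for each frame in a directory.
        cmd += ["--frame_cache_dir", frame_cache_dir, "--no_cache_results"]
    return cmd


short_cmd = build_inference_command("path/to/your/images_dir")
long_cmd = build_inference_command(
    "path/to/your/images_dir", "path/to/your/results_perframe_dir"
)
print(" ".join(long_cmd[1:]))
```

Passing either list to `subprocess.run(cmd, check=True)` from the repository root would then run the corresponding command.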

🚀 Run Demo

We provide demo code based on the NRGBD dataset. You can run it using the following command:

# --gt_path is optional
python demo_viser.py \
    --seq_path path/to/nrgbd/image_sequence \
    --frame_interval 10 \
    --gt_path path/to/nrgbd/gt_camera
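
As a rough illustration of what a frame interval of 10 means, the sketch below keeps every tenth image of a sorted sequence. It assumes that `--frame_interval` acts as a simple stride; the actual behavior of demo_viser.py may differ, and `subsample_frames` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed image extensions


def subsample_frames(paths: List[Path], frame_interval: int) -> List[Path]:
    """Keep every frame_interval-th image, in sorted filename order."""
    if frame_interval < 1:
        raise ValueError("frame_interval must be at least 1")
    images = sorted(p for p in paths if p.suffix.lower() in IMAGE_SUFFIXES)
    return images[::frame_interval]


# Zero-padded names so that filename order matches frame order.
frames = [Path(f"{i:04d}.png") for i in range(35)]
picked = subsample_frames(frames, frame_interval=10)
print([p.name for p in picked])  # ['0000.png', '0010.png', '0020.png', '0030.png']
```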

🧊 Long3D Dataset

The Long3D dataset is a benchmark designed for long-sequence 3D scene reconstruction. It provides 10 Hz image streams paired with dense ground-truth point clouds.

📊 Data Description

📥 Download Instructions

Option 1: Hugging Face CLI

The most efficient way to download the dataset is to use the Hugging Face Hub CLI. Make sure the library is installed (pip install -U huggingface_hub).

# Optional: uncomment to download through a mirror endpoint
# export HF_ENDPOINT=https://hf-mirror.com
hf download --repo-type dataset \
    --resume-download AutoLab-SJTU/Long3D \
    --local-dir ./Long3D
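
The same download can also be scripted from Python with `snapshot_download` from the `huggingface_hub` library. This is a minimal sketch: `download_long3d` is a hypothetical wrapper, and the repository ID and target directory are copied from the command above.

```python
def download_long3d(local_dir: str = "./Long3D") -> str:
    """Download the Long3D dataset snapshot and return its local path."""
    # Imported here so that defining the helper does not require the library.
    # To use a mirror, set HF_ENDPOINT in the environment before this import.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AutoLab-SJTU/Long3D",
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Calling `download_long3d()` starts the download and returns the path of the local copy.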

Option 2: Manual Access

Alternatively, you can browse and download files directly from the Long3D dataset.

📋 Checklist

[ √ ] Release the Dataset.

🙏 Acknowledgement

We would like to acknowledge the following open-source projects that served as a foundation for our implementation:

DUSt3R, CUT3R, VGGT, Point3R, StreamVGGT, FastVGGT, TTT3R

Many thanks to these authors!

📜 Citation

If you incorporate our work into your research, please cite:

@misc{yuan2026infinitevggt,
        title={InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams}, 
        author={Shuai Yuan and Yantai Yang and Xiaotian Yang and Xupeng Zhang and Zhonghao Zhao and Lingming Zhang and Zhipeng Zhang},
        journal={arXiv preprint arXiv:2601.02281},
        year={2026}
}
