RobustVGGT
[CVPR'26] Official implementation of "Emergent Outlier View Rejection in Visual Geometry Grounded Transformers"
We reveal that Visual Geometry Grounded Transformers (VGGT) has a built-in ability to detect outliers, which we leverage to perform outlier-view rejection without any fine-tuning.
What to expect:
- [x] Demo inference code
- [ ] Evaluation code
- [ ] Visualization code
Installation
Our code was developed with PyTorch 2.5.1, CUDA 12.1, and Python 3.10.
We recommend using conda for installation:
conda create -n robust_vggt python=3.10
conda activate robust_vggt
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
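After installing, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch and not part of the repository: the helper names `find_mismatches` and `detect_versions` are hypothetical, and the check relies only on `sys.version_info`, `torch.__version__`, and `torch.version.cuda`.

```python
"""Sanity-check the environment against the versions listed in this README."""
import sys

# Versions stated in the installation instructions above.
EXPECTED = {"python": "3.10", "torch": "2.5.1", "cuda": "12.1"}


def find_mismatches(found, expected=EXPECTED):
    """Return one human-readable message per version that differs."""
    problems = []
    for name, want in expected.items():
        have = found.get(name)
        # Prefix match so that a build tag such as "2.5.1+cu121" satisfies "2.5.1".
        if have is None or not have.startswith(want):
            problems.append(f"{name}: expected {want}, found {have}")
    return problems


def detect_versions():
    """Collect installed versions; torch entries stay None if torch is missing."""
    found = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except ImportError:
        return found
    found["torch"] = torch.__version__
    found["cuda"] = torch.version.cuda  # None on CPU-only builds
    return found


if __name__ == "__main__":
    issues = find_mismatches(detect_versions())
    print("Environment matches the README." if not issues else "\n".join(issues))
```

A mismatch does not necessarily mean the demo will fail; it only signals that the setup differs from the one the code was developed with.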
Running Demo
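To run the robust reconstruction demo with outlier rejection, pass an image directory via --image-dir; the rejection threshold can optionally be set with --rej-thresh: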
python robust_vggt.py --image-dir examples/trevi
python robust_vggt.py --image-dir examples/notredame --rej-thresh 0.3
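To run the demo on several scenes in one go, the two commands above can be wrapped in a small driver script. This is only a sketch under stated assumptions: `build_command` and `RUNS` are hypothetical names, the script is assumed to be launched from the repository root, and only the `--image-dir` and `--rej-thresh` flags shown above are used.

```python
import subprocess
import sys


def build_command(image_dir, rej_thresh=None, script="robust_vggt.py"):
    """Assemble the demo command line using only the flags shown in the README."""
    cmd = [sys.executable, script, "--image-dir", str(image_dir)]
    if rej_thresh is not None:
        # Omit the flag entirely when no threshold is given, so the script
        # falls back to whatever default it defines.
        cmd += ["--rej-thresh", str(rej_thresh)]
    return cmd


# (image directory, optional threshold) pairs taken from the commands above.
RUNS = [("examples/trevi", None), ("examples/notredame", 0.3)]

if __name__ == "__main__":
    for image_dir, thresh in RUNS:
        # check=True raises CalledProcessError at the first failing run.
        subprocess.run(build_command(image_dir, thresh), check=True)
```

Using `sys.executable` keeps the runs inside the active conda environment rather than whichever `python` happens to be first on the PATH.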
Citation
@article{han2025emergent,
  title={Emergent Outlier View Rejection in Visual Geometry Grounded Transformers},
  author={Han, Jisang and Hong, Sunghwan and Jung, Jaewoo and Jang, Wooseok and An, Honggyu and Wang, Qianqian and Kim, Seungryong and Feng, Chen},
  journal={arXiv preprint arXiv:2512.04012},
  year={2025}
}
Acknowledgement
We thank the authors of VGGT for their excellent work and code, which served as the foundation for this project.
