VICT
[CVPR 2025] Test-Time Visual In-Context Tuning
<strong>We present VICT, a test-time visual in-context tuning method that can adapt visual in-context learning models on the fly with a single test sample. VICT can be applied to a wide range of unseen domains and tasks at test time.</strong>
<div style="text-align:center"> <img src="assets/teaser.png" width="100%" height="100%"> </div>

📖 For more results, please refer to our <a href="https://arxiv.org/abs/2503.21777" target="_blank">paper</a>.
📣 News
- [03/2025] 🔥 VICT is released on arXiv.
🌟 Method
VICT is a simple yet effective test-time training approach to adapt visual in-context learning (VICL) models on the fly. The motivation is that each test input offers a hint about the test distribution. Thus, we modify a VICL model at test time to make full use of this hint by setting up a <i>one-sample learning problem</i>.
Specifically, we flip the roles of the task prompt and the test sample, and use a cycle-consistency self-supervised loss to reconstruct the original task prompt output. Our key insight is that a model should be aware of a new test distribution if it can successfully recover the original task prompts.
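The flipped-role tuning loop above can be sketched numerically. This is a minimal illustration with hypothetical names, not the repository's implementation: the real VICT model is a masked-image in-context ViT, whereas the toy predictor below is a plain linear map that ignores prompt conditioning. Only the loop structure — predict the query output, swap prompt and query, reconstruct the original prompt output, take one gradient step — mirrors the method.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)) * 0.1  # toy model weights, adapted at test time


def predict(W, prompt_in, prompt_out, query_in):
    """Toy stand-in for in-context prediction (ignores the prompt pair)."""
    return W @ query_in


def vict_step(W, prompt_in, prompt_out, query_in, lr=0.05):
    # 1) Predict the test query's output with the current model.
    query_out_hat = predict(W, prompt_in, prompt_out, query_in)
    # 2) Flip roles: the (query, prediction) pair becomes the prompt,
    #    and the original prompt input becomes the new query.
    recon = predict(W, query_in, query_out_hat, prompt_in)
    # 3) Cycle-consistency loss: the model should recover prompt_out.
    loss = np.mean((recon - prompt_out) ** 2)
    # 4) One gradient step on W (MSE gradient derived by hand for the toy).
    grad = 2.0 / recon.size * np.outer(recon - prompt_out, prompt_in)
    return W - lr * grad, loss


# One-sample test-time tuning: a single prompt pair and a single test input.
prompt_in, prompt_out = rng.normal(size=4), rng.normal(size=4)
query_in = rng.normal(size=4)
losses = []
for _ in range(20):
    W, loss = vict_step(W, prompt_in, prompt_out, query_in)
    losses.append(loss)
```

In this sketch the cycle-consistency loss shrinks over the 20 steps, which is the signal VICT exploits: a model that can reconstruct the prompt output has adapted to the test distribution.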
<div style="text-align:center"> <img src="assets/pipeline.png" width="100%" height="100%"> </div>

🤗 Qualitative Examples
Unseen Domains
Middle-/High-Level Tasks with Corruptions
<div style="text-align:center"> <img src="assets/unseen_domains_part1.png" width="100%" height="100%"> </div>

Low-Level Tasks with Corruptions
<div style="text-align:center"> <img src="assets/unseen_domains_part2.png" width="100%" height="100%"> </div>

Unseen Tasks
<div style="text-align:center"> <img src="assets/unseen_tasks.png" width="100%" height="100%"> </div>

🛠️ Usage
Installation
See installation instructions.
Data
See data instructions.
Training
Evaluation
👨‍💻 Todo
- [x] Release the arXiv version.
- [x] Release the code.
📘 Citation
If you find this work useful for your research, please consider citing our paper:
@inproceedings{xie2025test,
  title     = {Test-Time Visual In-Context Tuning},
  author    = {Xie, Jiahao and Tonioni, Alessio and Rauschmayr, Nathalie and Tombari, Federico and Schiele, Bernt},
  booktitle = {CVPR},
  year      = {2025}
}
❤️ Acknowledgement
We acknowledge the use of the following public code in this project: Painter, MAE, BEiT, detectron2, Mask2Former, bts, mmcv, mmdetection, mmpose, MIRNet, MPRNet, and Uformer.
