FindTrack

This is the official PyTorch implementation of our paper:

Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation, ICCVW 2025
Suhwan Cho*, Seunghoon Lee*, Minhyeok Lee, Jungho Lee, Sangyoun Lee
Link: [ICCVW] [arXiv]

You can also explore other related works at awesome-video-object-segmentation.

Demo Video

https://github.com/user-attachments/assets/e5475442-f2fe-4899-84dd-8ae7ef22a7f2

Abstract

Existing referring VOS methods typically fuse visual and textual features in a highly entangled manner, processing multi-modal information jointly. However, this entanglement often leads to challenges in resolving ambiguous target identification and maintaining consistent mask propagation across frames. To address these issues, we propose a decoupled framework that explicitly separates object identification from mask propagation. The key frame is adaptively selected based on segmentation confidence and vision-text alignment, establishing a reliable anchor for propagation.
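The key-frame selection described above can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `select_key_frame`, the `weight` parameter, and the weighted-sum combination rule are all assumptions made for illustration; the abstract only states that segmentation confidence and vision-text alignment are both taken into account.

```python
from typing import Sequence


def select_key_frame(
    seg_conf: Sequence[float],
    align: Sequence[float],
    weight: float = 0.5,
) -> int:
    """Return the index of the frame with the highest combined score.

    seg_conf: per-frame segmentation confidence (hypothetical input).
    align:    per-frame vision-text alignment score (hypothetical input).
    weight:   share given to segmentation confidence; the weighted sum
              is an assumed combination rule, not the paper's.
    """
    if len(seg_conf) != len(align):
        raise ValueError("seg_conf and align must have the same length")
    if len(seg_conf) == 0:
        raise ValueError("at least one frame is required")
    # Combine the two cues into one score per frame.
    scores = [weight * c + (1.0 - weight) * a for c, a in zip(seg_conf, align)]
    # The best-scoring frame becomes the anchor for propagation.
    return max(range(len(scores)), key=scores.__getitem__)
```

For example, `select_key_frame([0.9, 0.6, 0.8], [0.2, 0.7, 0.9])` picks the last frame, since it is the only one that scores well on both cues.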

Setup

1. Download the datasets: Ref-YouTube-VOS, Ref-DAVIS17, MeViS.

2. Download the Alpha-CLIP weights and place them in the weights/ directory.

Running

Training (optional)

FindTrack works well in a training-free manner, but fine-tuning on specific datasets can improve performance further.

For the Ref-YouTube-VOS dataset:

deepspeed --num_gpus 4 train_ytvos.py

For the MeViS dataset:

deepspeed --num_gpus 4 train_mevis.py

Testing

For the Ref-YouTube-VOS dataset:

python run_ytvos.py

For the MeViS dataset:

python run_mevis.py

Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Pre-trained model path
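
The checklist above could be automated with a small pre-flight script. The following is only a sketch and is not part of the repository. The helper name `preflight` and its arguments are hypothetical. The default `weights` directory follows the Setup section, while the dataset location is an assumption you must supply yourself.

```python
from pathlib import Path
from typing import List


def preflight(dataset_dir: str, weights_dir: str = "weights") -> List[str]:
    """Collect human-readable problems found before running a test script."""
    problems: List[str] = []

    # Testing dataset selection: the chosen dataset folder must exist.
    if not Path(dataset_dir).is_dir():
        problems.append(f"dataset directory not found: {dataset_dir}")

    # Pre-trained model path: the weights folder must exist and be non-empty.
    weights = Path(weights_dir)
    if not weights.is_dir() or not any(weights.iterdir()):
        problems.append(f"weights directory missing or empty: {weights_dir}")

    # GPU availability: only checked if PyTorch can be imported.
    try:
        import torch

        if not torch.cuda.is_available():
            problems.append("no CUDA GPU visible to PyTorch")
    except ImportError:
        problems.append("PyTorch is not installed")

    return problems
```

An empty list means all three checks passed. Otherwise, print the returned messages and fix them before launching run_ytvos.py or run_mevis.py.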

Gradio Demo

You can use the web demo with your own video!

Run the Gradio demo with:

python demo.py

Attachments

Pre-computed results

Contact

Code and models are only available for non-commercial research purposes.
For questions or inquiries, feel free to contact:

E-mail: suhwanx@gmail.com
