# FDTA

The official repository of the paper "From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking".
## [CVPR 2026] From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking
**TL;DR.** We show that DETR-based end-to-end MOT suffers from overly similar object embeddings. FDTA explicitly enhances embedding discriminability within this paradigm.

## 📢 News
- [2026] Our paper has been accepted by CVPR 2026! The code is now released. Please star this repo for updates!
## 🚀 Getting Started

### 1. Environment Setup

```shell
conda create -n FDTA python=3.12
conda activate FDTA
pip install -r requirements.txt

# Compile the Deformable Attention operator:
cd models/ops/
sh make.sh
```
### 2. Data Preparation

Please organize your datasets under `./datasets/` in the following structure. Note that depth maps are required alongside the RGB images:

```
datasets/
├── DanceTrack/
│   ├── train/
│   │   └── <sequence>/
│   │       ├── img1/
│   │       ├── gt/
│   │       └── depth/
│   ├── val/
│   └── test/
├── SportsMOT/
│   ├── train/
│   │   └── <sequence>/
│   │       ├── img1/
│   │       ├── gt/
│   │       └── depth/
│   ├── val/
│   └── test/
└── BFT/
    ├── train/
    │   └── <sequence>/
    │       ├── img1/
    │       ├── gt/
    │       └── depth/
    └── test/
```
Dataset sources:
- DanceTrack: download from the official repository.
- SportsMOT: download from the official repository or HuggingFace.
- BFT: download from Google Drive or Baidu Pan (NetTrack).
- Depth maps: generated with Video-Depth-Anything (see `tools/gen_depthmaps.py`). Other depth estimators that produce the same directory structure are also supported.
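Before training, it can save time to confirm every sequence actually contains the three required subdirectories. Below is a minimal sketch of such a check; the directory names (`img1/`, `gt/`, `depth/`) come from the layout above, while the helper name `check_dataset` is ours and not part of the FDTA codebase:

```python
# Sketch: verify each <split>/<sequence>/ contains img1/, gt/ and depth/.
# `check_dataset` is a hypothetical helper, not an FDTA API.
from pathlib import Path

REQUIRED_SUBDIRS = ("img1", "gt", "depth")

def check_dataset(root, split="train"):
    """Return a list of problems found under <root>/<split>/<sequence>/."""
    problems = []
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        return [f"missing split directory: {split_dir}"]
    for seq in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        for sub in REQUIRED_SUBDIRS:
            if not (seq / sub).is_dir():
                problems.append(f"{seq.name}: missing {sub}/")
    return problems

if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as tmp:
        # Build a toy sequence that lacks depth maps.
        seq = Path(tmp) / "train" / "dancetrack0001"
        (seq / "img1").mkdir(parents=True)
        (seq / "gt").mkdir()
        print(check_dataset(tmp))  # → ['dancetrack0001: missing depth/']
```

Running it over each dataset root (e.g. `./datasets/DanceTrack/`) surfaces missing depth folders before they cause a mid-training crash.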
### 3. Pre-trained Weights
We use COCO pre-trained Deformable DETR weights for initialization, sourced from MOTIP. Download the weights and place them under ./pretrains/:
| File | Description | Download |
|------|-------------|----------|
| r50_deformable_detr_coco.pth | COCO pre-trained (base) | link |
| r50_deformable_detr_coco_dancetrack.pth | Fine-tuned on DanceTrack | link |
| r50_deformable_detr_coco_sportsmot.pth | Fine-tuned on SportsMOT | link |
| r50_deformable_detr_coco_bft.pth | Fine-tuned on BFT | link |
Our trained FDTA model weights are available on HuggingFace. Download and place them under ./checkpoints/ for inference.
### 4. Training

All training configurations are stored in the `./configs/` folder. For example, to train on DanceTrack:

```shell
accelerate launch --num_processes=4 train.py \
    --data-root /path/to/your/datasets/ \
    --exp-name fdta_dancetrack \
    --config-path ./configs/dancetrack.yaml \
    --detr-pretrain ./pretrains/r50_deformable_detr_coco_dancetrack.pth
```
Replace `dancetrack` with `sportsmot` or `bft` to train on the other datasets.
Note: If your GPU has less than 24 GB of memory, set `--detr-num-checkpoint-frames 2` (< 16 GB) or `--detr-num-checkpoint-frames 1` (< 12 GB) to reduce memory usage.
### 5. Inference

We support two inference modes:
- `submit`: generate tracker files for submission.
- `evaluate`: generate tracker files and compute evaluation metrics.
Evaluate on the validation set:

```shell
accelerate launch --num_processes=4 submit_and_evaluate.py \
    --data-root /path/to/your/datasets/ \
    --inference-mode evaluate \
    --config-path ./configs/dancetrack.yaml \
    --inference-model ./checkpoints/dancetrack.pth \
    --outputs-dir ./outputs/ \
    --inference-dataset DanceTrack \
    --inference-split val
```

Generate submission files for the test set:

```shell
accelerate launch --num_processes=4 submit_and_evaluate.py \
    --data-root /path/to/your/datasets/ \
    --inference-mode submit \
    --config-path ./configs/dancetrack.yaml \
    --inference-model ./checkpoints/dancetrack.pth \
    --outputs-dir ./outputs/ \
    --inference-dataset DanceTrack \
    --inference-split test
```
You can add `--inference-dtype FP16` for faster inference with minimal performance loss.
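Tracker files for MOT benchmarks are conventionally plain-text files with one detection per line in the MOTChallenge layout `frame,id,x,y,w,h,conf,...`. A small sketch for inspecting the generated files, assuming that format (check the files under `./outputs/` to confirm FDTA emits the same layout):

```python
# Sketch: parse a MOTChallenge-style tracker file (frame,id,x,y,w,h,conf,...).
# Assumes the common MOT txt format; not an FDTA API.
from collections import defaultdict

def parse_mot_file(lines):
    """Map frame number -> list of (track_id, x, y, w, h, conf)."""
    tracks = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        frame, tid = int(fields[0]), int(fields[1])
        x, y, w, h, conf = map(float, fields[2:7])
        tracks[frame].append((tid, x, y, w, h, conf))
    return dict(tracks)

sample = [
    "1,1,100.0,200.0,50.0,120.0,0.98,-1,-1,-1",
    "1,2,300.0,180.0,45.0,110.0,0.95,-1,-1,-1",
    "2,1,102.0,201.0,50.0,120.0,0.97,-1,-1,-1",
]
per_frame = parse_mot_file(sample)
print(len(per_frame[1]))  # → 2
```

This is handy for sanity checks (e.g. counting identities per frame) before uploading submission files to an evaluation server.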
## Main Results

### DanceTrack

| Training Data | HOTA | IDF1 | AssA | MOTA | DetA |
|---------------|------|------|------|------|------|
| train | 71.7 | 77.2 | 63.5 | 91.3 | 81.0 |
| train+val | 74.4 | 80.0 | 67.0 | 92.2 | 82.7 |

### SportsMOT

| Training Data | HOTA | IDF1 | AssA | MOTA | DetA |
|---------------|------|------|------|------|------|
| train | 74.2 | 78.5 | 65.5 | 93.0 | 84.1 |

### BFT

| Training Data | HOTA | IDF1 | AssA | MOTA | DetA |
|---------------|------|------|------|------|------|
| train | 72.2 | 84.2 | 74.5 | 78.2 | 70.1 |
## Acknowledgements

The code is built on top of these awesome repositories. We thank the authors for open-sourcing their code.
## Citation
If you find our work useful for your research, please consider citing:
```bibtex
@article{shao2025fdta,
  title={From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking},
  author={Shao, Yuqing and Yang, Yuchen and Yu, Rui and Li, Weilong and Guo, Xu and Yan, Huaicheng and Wang, Wei and Sun, Xiao},
  journal={arXiv preprint arXiv:2512.02392},
  year={2025}
}
```
