FDTA

The official repository of the paper "From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking"


[CVPR 2026] From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking


TL;DR. We reveal that DETR-based end-to-end MOT suffers from overly similar object embeddings, which makes cross-frame association ambiguous. FDTA explicitly enhances embedding discriminativeness within this paradigm.
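The failure mode can be illustrated with a toy cosine-similarity check (illustrative sketch only; the vectors below are made up for demonstration and do not come from the model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Two distinct objects with nearly identical embeddings are hard to
# associate across frames: their similarity to any track is almost the same.
obj_a = [0.9, 0.1, 0.0]
obj_b = [0.88, 0.12, 0.0]
print(f"similar embeddings:  {cosine(obj_a, obj_b):.4f}")  # close to 1.0

# More discriminative embeddings separate the two objects cleanly.
obj_a2 = [1.0, 0.0, 0.0]
obj_b2 = [0.0, 1.0, 0.0]
print(f"discriminative pair: {cosine(obj_a2, obj_b2):.4f}")
```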


📢 News

  • [2026] Our paper has been accepted by CVPR 2026! The code is now released. Please star this repo for updates!

🚀 Getting Started

1. Environment Setup

conda create -n FDTA python=3.12
conda activate FDTA
pip install -r requirements.txt
# Compile the Deformable Attention operator:
cd models/ops/
sh make.sh
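After make.sh finishes, you can sanity-check that the extension imports from Python (a sketch; the module name MultiScaleDeformableAttention follows upstream Deformable DETR and is an assumption here):

```python
def deformable_attn_built():
    """Return True if the compiled Deformable Attention extension imports."""
    try:
        import MultiScaleDeformableAttention  # noqa: F401  (name assumed from Deformable DETR)
        return True
    except ImportError:
        return False

if __name__ == "__main__":
    status = "built" if deformable_attn_built() else "NOT built -- re-run models/ops/make.sh"
    print(f"Deformable Attention op: {status}")
```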

2. Data Preparation

Please organize your datasets under ./datasets/ in the following structure. Note that depth maps are required alongside the RGB images:

datasets/
├── DanceTrack/
│   ├── train/
│   │   └── <sequence>/
│   │       ├── img1/
│   │       ├── gt/
│   │       └── depth/
│   ├── val/
│   └── test/
├── SportsMOT/
│   ├── train/
│   │   └── <sequence>/
│   │       ├── img1/
│   │       ├── gt/
│   │       └── depth/
│   ├── val/
│   └── test/
└── BFT/
    ├── train/
    │   └── <sequence>/
    │       ├── img1/
    │       ├── gt/
    │       └── depth/
    └── test/

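Before training, a quick script can confirm that every sequence contains the three required subfolders (a sketch based on the tree above; note that gt/ may legitimately be absent for test splits):

```python
from pathlib import Path

REQUIRED = ("img1", "gt", "depth")

def check_split(split_dir, required=REQUIRED):
    """Return (sequence_name, missing_subdir) pairs for incomplete sequences."""
    problems = []
    for seq in sorted(p for p in Path(split_dir).iterdir() if p.is_dir()):
        for sub in required:
            if not (seq / sub).is_dir():
                problems.append((seq.name, sub))
    return problems

if __name__ == "__main__":
    split = Path("datasets/DanceTrack/train")
    if split.is_dir():
        for seq, sub in check_split(split):
            print(f"{seq}: missing {sub}/")
```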
Dataset sources:

3. Pre-trained Weights

We use COCO pre-trained Deformable DETR weights for initialization, sourced from MOTIP. Download the weights and place them under ./pretrains/:

| File | Description | Download |
|------|-------------|----------|
| r50_deformable_detr_coco.pth | COCO pre-trained (base) | link |
| r50_deformable_detr_coco_dancetrack.pth | Fine-tuned on DanceTrack | link |
| r50_deformable_detr_coco_sportsmot.pth | Fine-tuned on SportsMOT | link |
| r50_deformable_detr_coco_bft.pth | Fine-tuned on BFT | link |

Our trained FDTA model weights are available on HuggingFace. Download and place them under ./checkpoints/ for inference.

4. Training

All training configurations are stored in the ./configs/ folder. For example, to train on DanceTrack:

accelerate launch --num_processes=4 train.py \
  --data-root /path/to/your/datasets/ \
  --exp-name fdta_dancetrack \
  --config-path ./configs/dancetrack.yaml \
  --detr-pretrain ./pretrains/r50_deformable_detr_coco_dancetrack.pth

Replace dancetrack with sportsmot or bft to train on other datasets.

Note: If your GPU has less than 24 GB of memory, you can set --detr-num-checkpoint-frames 2 (< 16 GB) or --detr-num-checkpoint-frames 1 (< 12 GB) to reduce memory usage.
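Since the per-dataset commands differ only in naming, a small helper can assemble them (a sketch; the flag names and file layout are taken from the commands above, while the helper itself is hypothetical):

```python
def train_command(dataset, data_root, num_gpus=4, checkpoint_frames=None):
    """Assemble the accelerate launch command for dancetrack, sportsmot, or bft."""
    cmd = (
        f"accelerate launch --num_processes={num_gpus} train.py"
        f" --data-root {data_root}"
        f" --exp-name fdta_{dataset}"
        f" --config-path ./configs/{dataset}.yaml"
        f" --detr-pretrain ./pretrains/r50_deformable_detr_coco_{dataset}.pth"
    )
    if checkpoint_frames is not None:
        # Trade extra recomputation for lower memory on smaller GPUs.
        cmd += f" --detr-num-checkpoint-frames {checkpoint_frames}"
    return cmd

print(train_command("sportsmot", "/path/to/your/datasets/", checkpoint_frames=2))
```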

5. Inference

We support two inference modes:

  • submit: Generate tracker files for submission.
  • evaluate: Generate tracker files and compute evaluation metrics.

For example, to evaluate on the DanceTrack validation set:
accelerate launch --num_processes=4 submit_and_evaluate.py \
  --data-root /path/to/your/datasets/ \
  --inference-mode evaluate \
  --config-path ./configs/dancetrack.yaml \
  --inference-model ./checkpoints/dancetrack.pth \
  --outputs-dir ./outputs/ \
  --inference-dataset DanceTrack \
  --inference-split val

To generate submission files for the DanceTrack test set:
accelerate launch --num_processes=4 submit_and_evaluate.py \
  --data-root /path/to/your/datasets/ \
  --inference-mode submit \
  --config-path ./configs/dancetrack.yaml \
  --inference-model ./checkpoints/dancetrack.pth \
  --outputs-dir ./outputs/ \
  --inference-dataset DanceTrack \
  --inference-split test

You can add --inference-dtype FP16 for faster inference with minimal performance loss.
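The tracker files written under --outputs-dir can be inspected with a few lines of Python (a sketch assuming the standard MOTChallenge text format, frame,id,x,y,w,h,conf,...; this repo's exact output layout is not confirmed here):

```python
import csv

def load_tracks(path):
    """Parse a MOTChallenge-style tracker file into {frame: [(track_id, x, y, w, h)]}."""
    tracks = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue
            frame, track_id = int(row[0]), int(row[1])
            x, y, w, h = (float(v) for v in row[2:6])
            tracks.setdefault(frame, []).append((track_id, x, y, w, h))
    return tracks
```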


Main Results

DanceTrack

| Training Data | HOTA | IDF1 | AssA | MOTA | DetA |
|---------------|------|------|------|------|------|
| train | 71.7 | 77.2 | 63.5 | 91.3 | 81.0 |
| train+val | 74.4 | 80.0 | 67.0 | 92.2 | 82.7 |

SportsMOT

| Training Data | HOTA | IDF1 | AssA | MOTA | DetA |
|---------------|------|------|------|------|------|
| train | 74.2 | 78.5 | 65.5 | 93.0 | 84.1 |

BFT

| Training Data | HOTA | IDF1 | AssA | MOTA | DetA |
|---------------|------|------|------|------|------|
| train | 72.2 | 84.2 | 74.5 | 78.2 | 70.1 |

Acknowledgements

The code is built on top of several awesome repositories; we thank their authors for open-sourcing their code.

Citation

If you find our work useful for your research, please consider citing:

@article{shao2025fdta,
  title={From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking},
  author={Shao, Yuqing and Yang, Yuchen and Yu, Rui and Li, Weilong and Guo, Xu and Yan, Huaicheng and Wang, Wei and Sun, Xiao},
  journal={arXiv preprint arXiv:2512.02392},
  year={2025}
}
