# ByteTrack

[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
ByteTrack is a simple, fast and strong multi-object tracker.
<p align="center"><img src="assets/sota.png" width="500"/></p>

ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang, Peize Sun, Yi Jiang, Dongdong Yu, Fucheng Weng, Zehuan Yuan, Ping Luo, Wenyu Liu, Xinggang Wang
## Demo Links

| Google Colab Demo | Huggingface Demo | YouTube Tutorial | Original Paper: ByteTrack |
|:-----------------:|:----------------:|:----------------:|:-------------------------:|
| | | | [arXiv 2110.06864](https://arxiv.org/abs/2110.06864) |

- Integrated to Huggingface Spaces with Gradio.
## Abstract

Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos. Most methods obtain identities by associating detection boxes whose scores are higher than a threshold. Objects with low detection scores, e.g. occluded objects, are simply thrown away, which causes non-negligible missed true objects and fragmented trajectories. To solve this problem, we present a simple, effective and generic association method that tracks by associating every detection box instead of only the high-score ones. For the low-score detection boxes, we utilize their similarities with tracklets to recover true objects and filter out background detections. When applied to 9 different state-of-the-art trackers, our method achieves consistent improvement on IDF1 score ranging from 1 to 10 points. To push forward the state-of-the-art performance of MOT, we design a simple and strong tracker, named ByteTrack. For the first time, we achieve 80.3 MOTA, 77.3 IDF1 and 63.1 HOTA on the test set of MOT17 with 30 FPS running speed on a single V100 GPU.
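The two-stage association described in the abstract (BYTE) can be sketched in a few lines of Python. This is a simplified illustration only: it uses greedy IoU matching instead of the Hungarian matching in the actual implementation, omits the Kalman-filter motion model, and the threshold values are placeholders.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def byte_associate(tracklets, detections,
                   high_thresh=0.6, low_thresh=0.1, match_thresh=0.3):
    """Two-stage association: match high-score detections first, then try
    to recover the still-unmatched tracklets with low-score detections.
    Each tracklet/detection is a dict with a 'box' (and 'score') key.
    Returns (matches, unmatched_tracklets, unmatched_high_detections)."""
    high = [d for d in detections if d["score"] >= high_thresh]
    low = [d for d in detections if low_thresh <= d["score"] < high_thresh]

    def greedy_match(tracks, dets):
        # Greedy stand-in for Hungarian assignment on the IoU matrix.
        matches, used_t, used_d = [], set(), set()
        pairs = sorted(((iou(t["box"], d["box"]), ti, di)
                        for ti, t in enumerate(tracks)
                        for di, d in enumerate(dets)), reverse=True)
        for score, ti, di in pairs:
            if score < match_thresh or ti in used_t or di in used_d:
                continue
            matches.append((tracks[ti], dets[di]))
            used_t.add(ti)
            used_d.add(di)
        rest_t = [t for i, t in enumerate(tracks) if i not in used_t]
        rest_d = [d for i, d in enumerate(dets) if i not in used_d]
        return matches, rest_t, rest_d

    # First association: high-score detections only.
    matches, rest_tracks, rest_high = greedy_match(tracklets, high)
    # Second association: recover remaining tracklets with low-score boxes,
    # instead of discarding them as most trackers do.
    more, rest_tracks, _ = greedy_match(rest_tracks, low)
    return matches + more, rest_tracks, rest_high
```

Unmatched low-score boxes are treated as background and dropped, while unmatched high-score boxes can start new tracklets.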
<p align="center"><img src="assets/teasing.png" width="400"/></p>

## News

- (2022.07) Our paper is accepted by ECCV 2022!
- (2022.06) A nice re-implementation by Baidu PaddleDetection!
## Tracking performance

### Results on MOT challenge test set

| Dataset | MOTA | IDF1 | HOTA | MT | ML | FP | FN | IDs | FPS |
|---------|------|------|------|-----|-----|-------|-------|------|------|
| MOT17 | 80.3 | 77.3 | 63.1 | 53.2% | 14.5% | 25491 | 83721 | 2196 | 29.6 |
| MOT20 | 77.8 | 75.2 | 61.3 | 69.2% | 9.5% | 26249 | 87594 | 1223 | 13.7 |
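For readers unfamiliar with the table's headline metric, MOTA aggregates the three error counts (false positives, false negatives, identity switches) into one score per the standard CLEAR-MOT definition. The ground-truth box count is dataset-specific and not listed here, so the numbers below are toy values, not the table's:

```python
def mota(fp, fn, id_switches, num_gt):
    """Multi-Object Tracking Accuracy: 1 - (FP + FN + IDSW) / GT,
    where GT is the total number of ground-truth boxes."""
    return 1.0 - (fp + fn + id_switches) / num_gt

# Toy numbers for illustration: 32 errors over 400 GT boxes -> ~0.92.
score = mota(10, 20, 2, 400)
```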
### Visualization results on MOT challenge test set

<img src="assets/MOT17-01-SDP.gif" width="400"/> <img src="assets/MOT17-07-SDP.gif" width="400"/>
<img src="assets/MOT20-07.gif" width="400"/> <img src="assets/MOT20-08.gif" width="400"/>
## Installation

### 1. Installing on the host machine

Step 1. Install ByteTrack.

```shell
git clone https://github.com/ifzhang/ByteTrack.git
cd ByteTrack
pip3 install -r requirements.txt
python3 setup.py develop
```

Step 2. Install pycocotools.

```shell
pip3 install cython
pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
```

Step 3. Install cython_bbox.

```shell
pip3 install cython_bbox
```
### 2. Docker build

```shell
docker build -t bytetrack:latest .

# Startup sample
mkdir -p pretrained && \
mkdir -p YOLOX_outputs && \
xhost +local: && \
docker run --gpus all -it --rm \
-v $PWD/pretrained:/workspace/ByteTrack/pretrained \
-v $PWD/datasets:/workspace/ByteTrack/datasets \
-v $PWD/YOLOX_outputs:/workspace/ByteTrack/YOLOX_outputs \
-v /tmp/.X11-unix/:/tmp/.X11-unix:rw \
--device /dev/video0:/dev/video0:mwr \
--net=host \
-e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR \
-e DISPLAY=$DISPLAY \
--privileged \
bytetrack:latest
```
## Data preparation

Download MOT17, MOT20, CrowdHuman, Cityperson and ETHZ and put them under <ByteTrack_HOME>/datasets in the following structure:

```
datasets
|——————mot
|        └——————train
|        └——————test
└——————crowdhuman
|        └——————Crowdhuman_train
|        └——————Crowdhuman_val
|        └——————annotation_train.odgt
|        └——————annotation_val.odgt
└——————MOT20
|        └——————train
|        └——————test
└——————Cityscapes
|        └——————images
|        └——————labels_with_ids
└——————ETHZ
         └——————eth01
         └——————...
         └——————eth07
```
Then you need to convert the datasets to COCO format and mix the different training data:

```shell
cd <ByteTrack_HOME>
python3 tools/convert_mot17_to_coco.py
python3 tools/convert_mot20_to_coco.py
python3 tools/convert_crowdhuman_to_coco.py
python3 tools/convert_cityperson_to_coco.py
python3 tools/convert_ethz_to_coco.py
```
Before mixing the different datasets, follow the operations in the mix_xxx.py scripts to create the data folders and symlinks. Then mix the training data:

```shell
cd <ByteTrack_HOME>
python3 tools/mix_data_ablation.py
python3 tools/mix_data_test_mot17.py
python3 tools/mix_data_test_mot20.py
```
## Model zoo

### Ablation model

Train on CrowdHuman and MOT17 half train, evaluate on MOT17 half val.

| Model | MOTA | IDF1 | IDs | FPS |
|-------|------|------|-----|-----|
| ByteTrack_ablation [google], [baidu(code:eeo8)] | 76.6 | 79.3 | 159 | 29.6 |
### MOT17 test model

Train on CrowdHuman, MOT17, Cityperson and ETHZ, evaluate on MOT17 train.

- Standard models

| Model | MOTA | IDF1 | IDs | FPS |
|-------|------|------|-----|-----|
| bytetrack_x_mot17 [google], [baidu(code:ic0i)] | 90.0 | 83.3 | 422 | 29.6 |
| bytetrack_l_mot17 [google], [baidu(code:1cml)] | 88.7 | 80.7 | 460 | 43.7 |
| bytetrack_m_mot17 [google], [baidu(code:u3m4)] | 87.0 | 80.1 | 477 | 54.1 |
| bytetrack_s_mot17 [google], [baidu(code:qflm)] | 79.2 | 74.3 | 533 | 64.5 |
- Light models

| Model | MOTA | IDF1 | IDs | Params (M) | FLOPs (G) |
|-------|------|------|-----|------------|-----------|
| bytetrack_nano_mot17 [google], [baidu(code:1ub8)] | 69.0 | 66.3 | 531 | 0.90 | 3.99 |
| bytetrack_tiny_mot17 [google], [baidu(code:cr8i)] | 77.1 | 71.5 | 519 | 5.03 | 24.45 |
### MOT20 test model

Train on CrowdHuman and MOT20, evaluate on MOT20 train.

| Model | MOTA | IDF1 | IDs | FPS |
|-------|------|------|-----|-----|
| bytetrack_x_mot20 [google], [baidu(code:3apd)] | 93.4 | 89.3 | 1057 | 17.5 |
## Training

The COCO-pretrained YOLOX models can be downloaded from the YOLOX model zoo. After downloading the pretrained models, put them under <ByteTrack_HOME>/pretrained.
- Train ablation model (MOT17 half train and CrowdHuman)

```shell
cd <ByteTrack_HOME>
python3 tools/train.py -f exps/example/mot/yolox_x_ablation.py -d 8 -b 48 --fp16 -o -c pretrained/yolox_x.pth
```
- Train MOT17 test model (MOT17 train, CrowdHuman, Cityperson and ETHZ)

```shell
cd <ByteTrack_HOME>
python3 tools/train.py -f exps/example/mot/yolox_x_mix_det.py -d 8 -b 48 --fp16 -o -c pretrained/yolox_x.pth
```
- Train MOT20 test model (MOT20 train, CrowdHuman)

For MOT20, you need to clip the bounding boxes inside the image. Add the clip operation at lines 134-135 in data_augment.py, lines 122-125 in mosaicdetection.py, [line 217-225 in mo
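The clip operation those edits refer to can look like the following. This is a sketch using NumPy, not the exact code in the listed files, and it assumes an [x1, y1, x2, y2] box layout:

```python
import numpy as np

def clip_boxes(boxes, img_w, img_h):
    """Clip [x1, y1, x2, y2] boxes to the image borders, as MOT20
    training requires (boxes in MOT20 can extend outside the frame)."""
    boxes = np.asarray(boxes, dtype=np.float32).copy()
    # Columns 0 and 2 are x-coordinates, columns 1 and 3 are y-coordinates.
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, img_w - 1)
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, img_h - 1)
    return boxes
```

For example, a box partially outside a 1920x1080 frame, such as [-5, 10, 2000, 900], is clipped to [0, 10, 1919, 900].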