MCTrack

[IROS 2025] This is the official implementation of the paper "MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving"

<div align=center><img src="./docs/MC_logo.png" width="55%"></div>

<p align=center>MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving</p>

<p align="center"> <a href="https://paperswithcode.com/sota/3d-multi-object-tracking-on-nuscenes?p=mctrack-a-unified-3d-multi-object-tracking"> <img src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/mctrack-a-unified-3d-multi-object-tracking/3d-multi-object-tracking-on-nuscenes" alt="PWC"> </a> </p> <p align="center"> <a href="https://paperswithcode.com/sota/3d-multi-object-tracking-on-kitti-1?p=mctrack-a-unified-3d-multi-object-tracking"> <img src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/mctrack-a-unified-3d-multi-object-tracking/3d-multi-object-tracking-on-kitti-1" alt="PWC"> </a> </p> <p align="center"> <a href="https://paperswithcode.com/sota/3d-multi-object-tracking-on-waymo-open?p=mctrack-a-unified-3d-multi-object-tracking"> <img src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/mctrack-a-unified-3d-multi-object-tracking/3d-multi-object-tracking-on-waymo-open" alt="PWC"> </a> </p> <br> <div align="center"> <a href='https://arxiv.org/abs/2409.16149'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> </div> <br>

0. Abstract

This paper introduces MCTrack, a new 3D multi-object tracking method that achieves state-of-the-art (SOTA) performance across KITTI, nuScenes, and Waymo datasets. Addressing the gap in existing tracking paradigms, which often perform well on specific datasets but lack generalizability, MCTrack offers a unified solution. Additionally, we have standardized the format of perceptual results across various datasets, termed BaseVersion, facilitating researchers in the field of multi-object tracking (MOT) to concentrate on the core algorithmic development without the undue burden of data preprocessing. Finally, recognizing the limitations of current evaluation metrics, we propose a novel set that assesses motion information output, such as velocity and acceleration, crucial for downstream tasks.

<p align="center"><img src="docs/Fig1.png" width="500"/></p>

1. News

  • 2025-06-16. MCTrack is accepted to IROS 2025.
  • 2024-10-08. The code has been released. 🙌
  • 2024-09-24. MCTrack is released on arXiv.
  • 2024-09-01. We rank 2nd among all methods on Waymo Dataset for MOT.
  • 2024-08-30. We rank 1st among all methods on KITTI Dataset for MOT.
  • 2024-08-27. We rank 1st among all methods on nuScenes Dataset for MOT.

2. Results

KITTI

online

| Method | Detector | Set | HOTA | MOTA | TP | FP | IDSW |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MCTrack | VirConv | test | 81.07 | 89.81 | 32367 | 2025 | 46 |
| MCTrack | VirConv | train | 82.65 | 85.19 | 22186 | 1659 | 22 |

offline

| Method | Detector | Set | HOTA | MOTA | TP | FP | IDSW |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MCTrack | VirConv | test | 82.75 | 91.79 | 32095 | 2297 | 11 |
| MCTrack | VirConv | train | 83.89 | 86.56 | 22150 | 1311 | 3 |

nuScenes

| Method | Detector | Set | AMOTA | MOTA | TP | FP | IDS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MCTrack | LargeKernel3D | test | 0.763 | 0.634 | 103327 | 19643 | 242 |
| MCTrack | CenterPoint | val | 0.740 | 0.640 | 85900 | 13083 | 275 |

Waymo

| Method | Detector | Set | MOTA / L1 | MOTP / L1 | MOTA / L2 | MOTP / L2 |
| --- | --- | --- | --- | --- | --- | --- |
| MCTrack | CTRL | test | 0.7504 | 0.2276 | 0.7344 | 0.2278 |
| MCTrack | CTRL | val | 0.7384 | 0.2288 | 0.7155 | 0.2293 |

3. Data preparation

BaseVersion Data Generation

  • First, you need to download the original datasets from KITTI, nuScenes, and Waymo, as well as their corresponding detection results, and organize them in the following directory structure. (Note: if you only want to test on the KITTI dataset, you only need to download the KITTI data.)
    • For KITTI
      data/
      └── kitti/
          ├── datasets/
          |    ├── testing/
          |    |    ├── calib/
          |    |    |   └── 0000.txt
          |    |    └── pose/
          |    |        └── 0000.txt
          |    └── training/
          |         ├── calib/
          |         ├── label_02/
          |         └── pose/
          └── detectors/
               ├── casa/
               │    ├── testing/
               │    │   ├── 0000/
               │    │   │   └── 000000.txt
               │    │   │   └── 000001.txt             
               │    │   └── 0001/
               │    └── testing/
               └── point_rcnn/
      
    • For nuScenes
      data/
      └── nuScenes/
          ├── datasets/
          |    ├── maps/
          |    ├── samples/
          |    ├── sweeps/
          |    ├── v1.0-test/
          |    └── v1.0-trainval/
          └── detectors/
               ├── centerpoint/
               |   └── val.json
               └── largekernel/
                   └── test.json
      
    • For Waymo
      • To prepare the Waymo data, you first need to follow ImmortalTracker's instructions to extract ego_info and ts_info (we will also provide these in the link, so you may be able to skip this step).

      • Follow ImmortalTracker's instructions to convert the detection results into .npz files.

      • Please note that we have modified the ego_info section of ImmortalTracker; the updated file is provided in preprocess/ego_info.py.

      data/
      └── Waymo/
          ├── datasets/
          |    ├── testing/
          |    |    ├── ego_info/
          |    |    │   ├── .npz
          |    |    │   └── .npz             
          |    |    └── ts_info/
          |    |        ├── .json
          |    |        └── .json          
          |    └── validation/
          |         ├── ego_info/
          |         └── ts_info/
          └── detectors/
               └── ctrl/
                    ├── testing/
                    │   ├── .npz
                    │   └── .npz        
                    └── validation/
                        ├── .npz
                        └── .npz 
      
  • Second, run the following command to generate the BaseVersion data format required by MCTrack. Alternatively, if you do not wish to regenerate the data, you can download the data we have prepared directly from Google Drive or Baidu Cloud. Due to copyright restrictions on the Waymo dataset, we are unable to provide the corresponding converted data.
    $ python preprocess/convert2baseversion.py --dataset kitti/nuscenes/waymo
    
  • Afterwards, you will find the BaseVersion-format data under data/base_version/.
    data/
    └── base_version/
        ├── kitti/
        │   ├── casa/
        │   |   ├── test.json
        │   |   └── val.json
        │   └── virconv/
        │       ├── test.json
        │       └── val.json
        ├── nuscenes/
        |   ├── centerpoint/
        |   │   └── val.json
        |   └── largekernel/
        |        └── test.json
        └── waymo/
            └── ctrl/
                ├── val.json
                └── test.json
    
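Once generated, the BaseVersion files are plain JSON and can be consumed with the standard library. The helper below is a hypothetical convenience function (not part of MCTrack's API) that only encodes the `data/base_version/<dataset>/<detector>/<split>.json` layout shown above; the content schema of each file is whatever `preprocess/convert2baseversion.py` emits.

```python
import json
from pathlib import Path

def load_base_version(root, dataset, detector, split):
    """Load one BaseVersion split, e.g. ("data", "kitti", "virconv", "val").

    Assumes the directory layout data/base_version/<dataset>/<detector>/<split>.json
    shown above; raises FileNotFoundError if the file has not been generated yet.
    """
    path = Path(root) / "base_version" / dataset / detector / f"{split}.json"
    with open(path) as f:
        return json.load(f)

# Example (requires the KITTI/VirConv files to have been generated or downloaded):
# data = load_base_version("data", "kitti", "virconv", "val")
```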

BaseVersion Data Format

scene-0001/
├── frame_0/
│   ├── cur_sample_token                # for nuScenes
│   ├── timestamp                       # The timestamp of each frame
│   ├── bboxes/                         # Detected bbox
│   │   ├── bbox_1/                     # Bbox1
│   │   │   ├── detection_score         # Detection score
│   │   │   ├── category                # Category
│   │   │   ├── global_xyz              # Center position of the global bbox
│   │   │   ├── global_orientation      # Orientation quaternion
│   │   │   ├── global_yaw              # Yaw
│   │   │   ├── lwh                     # Length, width, and height of the bbox
│   │   │   ├── global_velocity         # Velocity of the object in the global coordinate system
│   │   │   ├── global_acceleration     # Acceleration of the object in the global coordinate system
│   │   │   └── bbox_image/             # Information of the bbox in the image coordinate
│   │   │       ├── camera_type         # Camera position
│   │   │       └── x1y1x2y2            # Image coordinates
│   │   ├── bbox_2/
│   │   │   ├── detection_score
│   │   │   ├── category
│   │   │   └── ...
│   │   └── ...
│   └── transform_matrix/
│       ├── global2ego                 # Transformation matrix from global to ego 
│       ├── ego2lidar                  # Transformation matrix from ego to lidar
│       ├── global2lidar               # Transformation matrix from global to lidar 
│       └── cameras_transform_matrix/  # Camera-related transformation matrix
│           ├── CAM_FRONT/             # Front-view camera
│           │   ├── image_shape        # Image shape
│           │   ├── ego2camera         # Transformation matrix from ego to camera
│           │   ├── camera2image       # Transformation matrix from camera to image
│           │   ├── lidar2camera       # Transformation matrix from lidar to camera
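The layout above maps naturally onto nested dictionaries. As an illustration only (the field names follow the tree above, but the values and the access code are a hypothetical sketch, not MCTrack's API), here is how a detection's global center could be projected into the ego frame using the per-frame global2ego matrix:

```python
import numpy as np

# One frame in the BaseVersion layout sketched above (values are made up).
frame = {
    "timestamp": 1533151603512404,
    "bboxes": [
        {
            "detection_score": 0.91,
            "category": "car",
            "global_xyz": [10.0, 5.0, 1.0],
            "global_yaw": 0.0,
            "lwh": [4.5, 1.8, 1.6],
        }
    ],
    "transform_matrix": {
        # Identity here for simplicity; real data carries the full 4x4 pose.
        "global2ego": np.eye(4).tolist(),
    },
}

def global_to_ego(xyz, global2ego):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    T = np.asarray(global2ego, dtype=float)
    p = np.append(np.asarray(xyz, dtype=float), 1.0)  # homogeneous coordinates
    return (T @ p)[:3]

for box in frame["bboxes"]:
    if box["detection_score"] < 0.3:  # a typical score gate before tracking
        continue
    ego_xyz = global_to_ego(box["global_xyz"],
                            frame["transform_matrix"]["global2ego"])
    print(box["category"], ego_xyz)
```

With the identity pose used here the point is unchanged; with a real global2ego matrix the same call yields the box center in the ego frame.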
