TrackNetV3
TrackNetV3 modified version (TensorRT INT8 Optimization)
TrackNetV3 (TensorRT INT8 Optimization)
Based on the original version: https://github.com/qaz812345/TrackNetV3.git
Please see the section 'TensorRT INT8 Optimization and Inference'.
Tested on tennis videos with dedicated NVIDIA GPUs and the Jetson platform.
Introduction
We present TrackNetV3, a model composed of two core modules: trajectory prediction and rectification. The trajectory prediction module leverages an estimated background as auxiliary data to locate the shuttlecock in spite of the fluctuating visual interferences. This module also incorporates mixup data augmentation to formulate complex scenarios to strengthen the network’s robustness. Given that a shuttlecock can occasionally be obstructed, we create repair masks by analyzing the predicted trajectory, subsequently rectifying the path via inpainting. [paper]
<div align="center"> <a href="./"> <img src="./figure/NetArch.png" width="50%"/> </a> </div>
Performance
- Performance on the test split of Shuttlecock Trajectory Dataset.
Installation
- Install the requirements.
pip install -r requirements.txt
Inference (Original)
- Download the checkpoints
- Unzip the file and place the parameter files to `ckpts`
  unzip TrackNetV3_ckpts.zip
- Predict the label csv from the video
  python predict.py --video_file test.mp4 --tracknet_file ckpts/TrackNet_best.pt --inpaintnet_file ckpts/InpaintNet_best.pt --save_dir prediction
- Predict the label csv from the video, and output a video with predicted trajectory
  python predict.py --video_file test.mp4 --tracknet_file ckpts/TrackNet_best.pt --inpaintnet_file ckpts/InpaintNet_best.pt --save_dir prediction --output_video
- For large video
  - Enable the `--large_video` flag to use an IterableDataset instead of the normal Dataset, which prevents memory errors. Note that this will decrease the inference speed.
  - Use `--max_sample_num` to set the number of samples for background estimation.
  - Use `--video_range` to specify the start and end seconds of the video for background estimation.
  python predict.py --video_file test.mp4 --tracknet_file ckpts/TrackNet_best.pt --inpaintnet_file ckpts/InpaintNet_best.pt --save_dir prediction --large_video --video_range 324,330
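To make the two background-estimation options concrete, the sketch below shows one way a `--video_range` value such as `324,330` and a `--max_sample_num` cap could be turned into frame indices, and how a per-pixel median over the sampled frames gives a background image. The helper names (`sample_frame_indices`, `median_background`), the even-spacing strategy, and the 30 fps figure are assumptions made for illustration; `predict.py` may sample differently.

```python
from statistics import median


def sample_frame_indices(video_range, fps, max_sample_num):
    """Pick at most max_sample_num evenly spaced frame indices inside a
    'start,end' range given in seconds (hypothetical helper)."""
    start_sec, end_sec = (float(s) for s in video_range.split(","))
    if end_sec <= start_sec:
        raise ValueError("video_range must be 'start,end' with end > start")
    first = int(start_sec * fps)
    last = int(end_sec * fps)  # exclusive upper bound
    total = last - first
    step = max(1, -(-total // max_sample_num))  # ceiling division
    return list(range(first, last, step))


def median_background(frames):
    """Per-pixel median over equally sized grayscale frames (nested lists)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# 6 seconds at an assumed 30 fps, capped at 60 samples.
indices = sample_frame_indices("324,330", fps=30, max_sample_num=60)
print(len(indices), indices[0], indices[-1])

# A bright object that moves between frames is suppressed by the median.
frames = [[[10, 10]], [[10, 255]], [[255, 10]]]
background = median_background(frames)
print(background)
```

Fewer samples reduce memory use and time at the cost of a noisier estimate, which is the trade-off the two flags expose.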
Training
1. Prepare Dataset
- Download Shuttlecock Trajectory Dataset
- Adjust file structure:
  - Merge the `Professional` and `Amateur` match directories into a single `train` directory.
  - Rename the `Amateur` match directories to start from `match24` through `match26`.
  - Rename the `Test` directory to `test`.
- Dataset file structure:
  data
    ├─ train
    │   ├── match1/
    │   │   ├── csv/
    │   │   │   ├── 1_01_00_ball.csv
    │   │   │   ├── 1_02_00_ball.csv
    │   │   │   ├── …
    │   │   │   └── *_**_**_ball.csv
    │   │   ├── frame/
    │   │   │   ├── 1_01_00/
    │   │   │   │   ├── 0.png
    │   │   │   │   ├── 1.png
    │   │   │   │   ├── …
    │   │   │   │   └── *.png
    │   │   │   ├── 1_02_00/
    │   │   │   │   ├── 0.png
    │   │   │   │   ├── 1.png
    │   │   │   │   ├── …
    │   │   │   │   └── *.png
    │   │   │   ├── …
    │   │   │   └── *_**_**/
    │   │   │
    │   │   └── video/
    │   │       ├── 1_01_00.mp4
    │   │       ├── 1_02_00.mp4
    │   │       ├── …
    │   │       └── *_**_**.mp4
    │   ├── match2/
    │   │   ⋮
    │   └── match26/
    ├─ val
    │   ├── match1/
    │   ├── match2/
    │   │   ⋮
    │   └── match26/
    └─ test
        ├── match1/
        ├── match2/
        └── match3/
- Attributes in each csv file: Frame, Visibility, X, Y
- Data preprocessing
  python preprocess.py
- The `frame` directories and the `val` directory will be generated after preprocessing.
- Check the estimated background images in `<data_dir>/median`
  - If available, the dataset will use the median image of the match; otherwise, it will use the median image of the rally.
  - For example, you can exclude `train/match16/median.npz` due to camera angle discrepancies; the dataset will then fall back to the median image of each rally within match 16.
- Set the data root directory to `data_dir` in `dataset.py`.
  - `dataset.py` will generate the image mapping for each sample and cache the result in `.npy` files.
  - If you modify any related functions in `dataset.py`, please ensure you delete these cached files.
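The match-then-rally fallback described above can be sketched as a simple existence check. This is an illustration only: `pick_median_file` is a hypothetical helper, and the rally-level location `frame/<rally_id>/median.npz` is an assumption rather than something this README states.

```python
from pathlib import Path
import tempfile


def pick_median_file(match_dir, rally_id):
    """Return the match-level median file if present, else the rally-level one.

    The rally-level path (frame/<rally_id>/median.npz) is assumed for this sketch.
    """
    match_median = Path(match_dir) / "median.npz"
    if match_median.exists():
        return match_median
    return Path(match_dir) / "frame" / rally_id / "median.npz"


with tempfile.TemporaryDirectory() as root:
    match16 = Path(root) / "train" / "match16"
    (match16 / "frame" / "1_01_00").mkdir(parents=True)
    (match16 / "median.npz").touch()
    with_match = pick_median_file(match16, "1_01_00").relative_to(root).as_posix()

    # Excluding (here: deleting) the match-level file triggers the fallback.
    (match16 / "median.npz").unlink()
    without_match = pick_median_file(match16, "1_01_00").relative_to(root).as_posix()

print(with_match)
print(without_match)
```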
2. Train Tracking Module
- Train the tracking module from scratch
  python train.py --model_name TrackNet --seq_len 8 --epochs 30 --batch_size 10 --bg_mode concat --alpha 0.5 --save_dir exp --verbose
- Resume training (start from the last epoch to the specified epoch)
  python train.py --model_name TrackNet --epochs 30 --save_dir exp --resume_training --verbose
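The introduction mentions mixup data augmentation, and the training command passes `--alpha 0.5`. Assuming `--alpha` is the parameter of the Beta distribution that mixup draws its blending weight from (this README does not say so explicitly), the standard technique looks roughly like the following. The `mixup` function operates on plain lists for readability and is not the code used by `train.py`.

```python
import random


def mixup(sample_a, sample_b, alpha, rng=random):
    """Blend two equally sized samples with a weight drawn from Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(sample_a, sample_b)]
    return mixed, lam


rng = random.Random(0)  # seeded only to make the sketch repeatable
mixed, lam = mixup([0.0, 1.0], [1.0, 0.0], alpha=0.5, rng=rng)
print(lam, mixed)
```

With `alpha` below 1 the Beta distribution is U-shaped, so most draws keep one of the two samples dominant and only occasionally produce an even blend.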
3. Generate Predicted Trajectories and Inpainting Masks
- Generate predicted trajectories and inpainting masks for training the rectification module
  - Note that the coordinate range corresponds to the input spatial dimensions, not the size of the original image.
  python generate_mask_data.py --tracknet_file ckpts/TrackNet_best.pt --batch_size 16
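Because the generated coordinates live in the network's input space, they need rescaling before being drawn on an original frame. The sketch below assumes a 512×288 input resolution, which is not stated in this README; substitute the values your configuration actually uses. `to_original_coords` is a hypothetical helper.

```python
def to_original_coords(x, y, orig_w, orig_h, in_w=512, in_h=288):
    """Map a point from model-input pixels to original-image pixels.

    in_w and in_h default to an assumed 512x288 input size.
    """
    return x * orig_w / in_w, y * orig_h / in_h


# The centre of the input maps to the centre of a 1280x720 frame.
center = to_original_coords(256, 144, orig_w=1280, orig_h=720)
print(center)  # (640.0, 360.0)
```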
4. Train Rectification Module
- Train the rectification module from scratch.
  python train.py --model_name InpaintNet --seq_len 16 --epochs 300 --batch_size 32 --lr_scheduler StepLR --mask_ratio 0.3 --save_dir exp --verbose
- Resume training (start from the last epoch to the specified epoch)
  python train.py --model_name InpaintNet --epochs 30 --save_dir exp --resume_training
Evaluation
- Evaluate TrackNetV3 on test set
  python generate_mask_data.py --tracknet_file ckpts/TrackNet_best.pt --split_list test
  python test.py --inpaintnet_file ckpts/InpaintNet_best.pt --save_dir eval
- Evaluate the tracking module on test set
  python test.py --tracknet_file ckpts/TrackNet_best.pt --save_dir eval
- Generate video with ground truth label and predicted result
  python test.py --tracknet_file ckpts/TrackNet_best.pt --video_file data/test/match1/video/1_05_02.mp4
TensorRT INT8 Optimization and Inference
This section describes the workflow for converting the TrackNet model to an INT8-quantized TensorRT engine and running inference with it for improved performance. This process involves three main scripts: quantize.py, build_trt.py, and predict_i8.py.
Workflow Overview:
1. `quantize.py`: Convert a pre-trained PyTorch TrackNet model (`.pt`) to a quantized INT8 ONNX model (`.onnx`). This step requires a sample video for calibration.
2. `build_trt.py`: Take the generated ONNX model and build an optimized TensorRT engine (`.engine`). This engine is specific to your GPU hardware.
3. `predict_i8.py`: Use the TensorRT engine to perform fast tracking on videos.
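The first two steps can be chained from a small driver script. The sketch below only assembles the command lines, using the flags documented in the sections that follow; the function names and the `calib.mp4` path are placeholders. `predict_i8.py` is left out because its arguments are not listed here.

```python
import shlex


def quantize_cmd(video_file, tracknet_file):
    """Command line for step 1 (PTQ and ONNX export)."""
    return ["python", "quantize.py",
            "--video_file", video_file,
            "--tracknet_file", tracknet_file]


def build_cmd(onnx_file="tracknet_i8.onnx", trt_file="trt_i8.engine"):
    """Command line for step 2 (TensorRT engine build)."""
    return ["python", "build_trt.py",
            "--onnx_file", onnx_file,
            "--trt_file", trt_file]


steps = [quantize_cmd("calib.mp4", "ckpts/TrackNet_best.pt"), build_cmd()]
for argv in steps:
    print(shlex.join(argv))
    # To execute from the repository root instead of printing:
    # import subprocess; subprocess.run(argv, check=True)
```

Since the engine is specific to the GPU it was built on, step 2 would need to be rerun on each target device (for example, separately on a desktop GPU and on a Jetson board).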
1. Quantization (quantize.py)
This script performs Post-Training Quantization (PTQ) on a trained TrackNet PyTorch model and exports it to the ONNX format, ready for TensorRT.
User Manual:
- Purpose: To generate an INT8 quantized ONNX model from a PyTorch checkpoint.
- Command:
  python quantize.py --video_file path/to/your/calibration_video.mp4 --tracknet_file path/to/your/TrackNet_best.pt
- Arguments:
  - `--video_file`: (Required) Path to a video file. This video will be used to generate calibration data for the quantization process.
  - `--tracknet_file`: (Required) Path to the TrackNet PyTorch model checkpoint (e.g., `ckpts/TrackNet_best.pt`).
- Outputs:
  - `tracknet_i8.onnx`: The INT8 quantized model in ONNX format.
  - `amax_calibration.pth`: Saved amax values from the calibration process.
Use Cases:
- Preparing the model for INT8 inference to achieve higher throughput and lower latency, especially on NVIDIA GPUs that support INT8.
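For readers unfamiliar with the term, an amax value is the absolute maximum a tensor is expected to reach, and it fixes the scale of symmetric INT8 quantization. The sketch below illustrates the idea on plain floats using simple max calibration; it is not the code in `quantize.py`, which may use a different calibration method, and all function names are hypothetical.

```python
def calibrate_amax(values):
    """Max calibration: the largest absolute value seen in the calibration data."""
    return max(abs(v) for v in values)


def quantize_int8(x, amax):
    """Symmetric quantization: map [-amax, amax] onto [-127, 127] and clamp."""
    q = round(x * 127 / amax)
    return max(-127, min(127, q))


def dequantize_int8(q, amax):
    """Recover an approximate float from the INT8 code."""
    return q * amax / 127


activations = [0.02, -0.5, 0.25, 0.1]  # stand-in for calibration activations
amax = calibrate_amax(activations)
codes = [quantize_int8(v, amax) for v in activations]
print(amax, codes)
print([dequantize_int8(q, amax) for q in codes])
```

Values outside the calibrated range saturate at ±127, which is why the calibration video should resemble the footage used at inference time.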
2. TensorRT Engine Building (build_trt.py)
This script takes the ONNX model (produced by quantize.py) and builds a TensorRT engine.
User Manual:
- Purpose: To convert an ONNX model into an optimized TensorRT engine. The current script is configured for an input channel size of 27 (which corresponds to `seq_len=8` and `bg_mode='concat'` in the original model).
- Command to build engine:
  python build_trt.py --onnx_file tracknet_i8.onnx --trt_file trt_i8.engine
- Command to test engine (optional):
  python build_trt.py --trt_file trt_i8.engine --test --batch_size 1
- Arguments:
  - `--onnx_file`: Path to the input ONNX model file (default: `tracknet_i8.onnx`).
  - `--trt_file`: Path to save the output TensorRT engine file (default: `trt_i8.engine`).
  - `--batch_size`: Batch size to use for the optional inference test (default: 1). Only used if `--test` is specified.
  - `--test`: If specified
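The input channel size of 27 given in the Purpose line is consistent with stacking eight RGB frames plus one RGB background image. The formula below is an assumption inferred from that number, not taken from `build_trt.py`, and `concat_input_channels` is a hypothetical helper.

```python
def concat_input_channels(seq_len, channels_per_image=3):
    """Channels when seq_len RGB frames and one RGB background are stacked.

    Assumption for this sketch: bg_mode='concat' adds a single
    3-channel background image to the frame sequence.
    """
    return (seq_len + 1) * channels_per_image


channels = concat_input_channels(8)
print(channels)  # 27
```

A model trained with a different `seq_len` would therefore have a different channel count than the one the script is currently configured for.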
