
Single/Multiple Object Tracking and Segmentation

Code and comparisons for recent single/multiple object tracking and segmentation methods.

News

:boom: VLT_SCAR/VLT_TT is accepted by NeurIPS2022.

:boom: CNNInMo/TransInMo is accepted by IJCAI2022.

:boom: CSTrack is accepted by IEEE TIP.

:boom: OMC is accepted by AAAI2022. The training and testing code has been released in this codebase.

:boom: AutoMatch is accepted by ICCV2021. The training and testing code has been released in this codebase.

:boom: CSTrack ranks 5/4000 at Tianchi Global AI Competition.

:boom: Ocean is accepted by ECCV2020. [OceanPlus] is accepted by IEEE TIP.

:boom: SiamDW is accepted by CVPR2019 and selected as an oral presentation.

<!-- :boom: The improved version of [CSTrack_panda](https://github.com/JudasDie/SOTS/blob/master/lib/tutorial/CSTrack_panda/CSTrack_PANDA.md) has been released, containing the end-to-end training code on PANDA. It is a strong baseline for [Gigavision](http://gigavision.cn/index.html) MOT tracking. Our tracker takes **5th** place in the **Tianchi Global AI Competition (Tianchi Global AI Technology Innovation Competition, Track 2)**, with a score of **A-0.6712/B-0.6251 (A/B leaderboards)**, which surprisingly outperforms the baseline tracker JDE with a score of A-0.32/B-0.34. More details about CSTrack_panda can be found [here](https://blog.csdn.net/qq_34919792/article/details/116792954?spm=1001.2014.3001.5501). --> <!-- [![MOT Tracking on Panda](https://res.cloudinary.com/marcomontalbano/image/upload/v1622981850/video_to_markdown/images/youtube--zRCRgsrW71s-c05b58ac6eb4c4700831b2b3070cd403.jpg)](https://www.youtube.com/watch?v=zRCRgsrW71s "") -->

Supported Trackers (SOT and MOT)

Single-Object Tracking (SOT)

Multi-Object Tracking (MOT)

Results Comparison

Branches

  • SOT (or master): for our SOT trackers
  • MOT: for our MOT trackers
  • v0: old codebase supporting OceanPlus and TensorRT testing.

Please clone the branch that fits your needs.
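Fetching a single branch can be scripted. The snippet below is a minimal sketch, not part of the codebase: it assumes `git` is on your `PATH`, that the repository is cloned from `https://github.com/JudasDie/SOTS.git`, and the helper names `clone_command` and `clone` are hypothetical.

```python
import subprocess

# Assumed clone URL for this repository.
REPO_URL = "https://github.com/JudasDie/SOTS.git"
# Branch names taken from the list above.
BRANCHES = ("SOT", "master", "MOT", "v0")


def clone_command(branch, target_dir="SOTS"):
    """Build the `git clone` command that fetches only the given branch."""
    if branch not in BRANCHES:
        raise ValueError(f"unknown branch {branch!r}; expected one of {BRANCHES}")
    return ["git", "clone", "--branch", branch, "--single-branch", REPO_URL, target_dir]


def clone(branch, target_dir="SOTS"):
    """Run the clone; raises subprocess.CalledProcessError if git fails."""
    subprocess.run(clone_command(branch, target_dir), check=True)
```

For example, `clone("MOT")` would fetch only the MOT trackers into a local `SOTS` directory.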

Structure

  • experiments: training and testing settings
  • demo: figures for readme
  • dataset: testing dataset
  • data: training dataset
  • lib: core scripts for all trackers
  • snapshot: pre-trained models
  • pretrain: models trained on ImageNet (for training)
  • tracking: training and testing interface
$SOTS
|—— experiments
|—— lib
|—— snapshot
  |—— xxx.model
|—— dataset
  |—— VOT2019.json 
  |—— VOT2019
     |—— ants1...
  |—— VOT2020
     |—— ants1...
|—— ...
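Before launching training or testing, it can help to confirm that the checkout matches the layout above. The following is a small sketch under that assumption; the helper `missing_dirs` is hypothetical and only checks the directories shown in the tree.

```python
from pathlib import Path

# Directories shown in the tree above; extend the tuple as needed.
EXPECTED_DIRS = ("experiments", "lib", "snapshot", "dataset")


def missing_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_dirs("/path/to/SOTS")` returns an empty list when every listed directory is present.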

Performance

| <sub>Model</sub> | <sub>OTB2015</sub> | <sub>GOT10K</sub> | <sub>LaSOT</sub> | <sub>TNL2K</sub> | <sub>TrackingNet</sub> | <sub>NFS30</sub> | <sub>TOTB</sub> | <sub>VOT2019</sub> | <sub>TC128</sub> | <sub>UAV123</sub> | <sub>LaSOT_Ext</sub> | <sub>OTB-99-LANG</sub> |
|:-----:|:-:|:----:|:------:|:--------:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|
| <sub>SiamDW</sub> | <sub>0.670</sub> | <sub>0.429</sub> | <sub>0.386</sub> | <sub>0.348</sub> | <sub>61.1</sub> | <sub>0.521</sub> | <sub>0.500</sub> | <sub>0.241</sub> | <sub>0.583</sub> | <sub>0.536</sub> | <sub>-</sub> | <sub>-</sub> |
| <sub>Ocean</sub> | <sub>0.676</sub> | <sub>0.615</sub> | <sub>0.517</sub> | <sub>0.421</sub> | <sub>69.2</sub> | <sub>0.553</sub> | <sub>0.638</sub> | <sub>0.323</sub> | <sub>0.585</sub> | <sub>0.621</sub> | <sub>-</sub> | <sub>-</sub> |
| <sub>AutoMatch</sub> | <sub>0.714</sub> | <sub>0.652</sub> | <sub>0.583</sub> | <sub>0.472</sub> | <sub>76.0</sub> | <sub>0.606</sub> | <sub>0.668</sub> | <sub>0.322</sub> | <sub>0.634</sub> | <sub>0.644</sub> | <sub>-</sub> | <sub>-</sub> |
| <sub>CNNInMo</sub> | <sub>0.703</sub> | <sub>-</sub> | <sub>0.539</sub> | <sub>0.422</sub> | <sub>72.1</sub> | <sub>0.560</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>0.629</sub> | <sub>-</sub> | <sub>-</sub> |
| <sub>TransInMo</sub> | <sub>0.711</sub> | <sub>-</sub> | <sub>0.657</sub> | <sub>0.520</sub> | <sub>81.7</sub> | <sub>0.668</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>0.690</sub> | <sub>-</sub> | <sub>-</sub> |
| <sub>VLT_SCAR</sub> | <sub>-</sub> | <sub>0.610</sub> | <sub>0.639</sub> | <sub>0.498</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>0.447</sub> | <sub>0.739</sub> |
| <sub>VLT_TT</sub> | <sub>-</sub> | <sub>0.694</sub> | <sub>0.673</sub> | <sub>0.531</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>-</sub> | <sub>0.484</sub> | <sub>0.764</sub> |

Tracker Details

VLT_SCAR/VLT_TT [NeurIPS2022]

[Paper] [Raw Results] [Training and Testing Tutorial] <br/> VLT explores a different path to SOTA tracking without a complex Transformer, namely multimodal Vision-Language tracking. Its essence is a unified-adaptive Vision-Language representation, learned by the proposed ModaMixer and asymmetrical networks. Experiments show that our approach boosts a pure CNN-based Siamese tracker to performance competitive with, or even better than, recent SOTA trackers, and that it also benefits Transformer-based trackers. We hope this work inspires more possibilities for future tracking beyond Transformers.

<img src="https://github.com/JudasDie/SOTS/blob/SOT/demo/VLT.jpg" width="700" alt="VLT"/><br/>

CNNInMo/TransInMo [IJCAI2022]

[Paper] [Raw Results] [Training and Testing Tutorial] <br/> CNNInMo/TransInMo introduces a novel mechanism that conducts branch-wise interactions inside the visual tracking backbone network (InBN) via the proposed general interaction modeler (GIM). We show that both CNN and Transformer backbones benefit from InBN, which lets them learn more robust feature representations. Applying these backbones to Siamese tracking yields compelling tracking performance.

<img src="https://github.com/JudasDie/SOTS/blob/SOT/demo/TransInMo.jpg" width="700" alt="TransInMo"/><br/>

OMC [AAAI2022]

[Paper] [Training and Testing Tutorial] <br/> OMC introduces a double-check mechanism that lets targets misclassified as "fake background" be tracked again. Specifically, we design a re-check network as an auxiliary to the initial detections. If a target is missing from the first-check predictions (i.e., the results of the object detector), it is treated as a potentially misclassified target and has a chance to be restored by the re-check network, which searches for targets by mining temporal cues. Notably, the re-check network expands the role of the ID embedding from data association to motion forecasting by propagating previous tracklets to the current frame with a small overhead. Even with multiple tracklets, the re-check network can still propagate them all in one forward pass through a simple matrix multiplication. Building on the strong baseline CSTrack, we construct a new one-shot tracker and achieve favorable gains.
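The idea of propagating many tracklets with a single matrix multiplication can be illustrated with NumPy. This is only a sketch of the general technique, not the OMC re-check network itself: the function name `propagate_tracklets`, the array shapes, and the cosine normalization are all assumptions made for the example.

```python
import numpy as np


def propagate_tracklets(id_embeddings, frame_features, eps=1e-12):
    """Correlate N tracklet embeddings with a frame's embedding map in one matmul.

    id_embeddings:  (N, C) array, one ID embedding per previous tracklet.
    frame_features: (C, H, W) array, per-pixel embeddings of the current frame.
    Returns an (N, H, W) array of similarity maps, one per tracklet.
    """
    c, h, w = frame_features.shape
    # L2-normalize both sides so each dot product is a cosine similarity
    # (an assumption of this sketch).
    emb_norm = np.linalg.norm(id_embeddings, axis=1, keepdims=True)
    emb = id_embeddings / np.maximum(emb_norm, eps)
    feat = frame_features.reshape(c, h * w)
    feat = feat / np.maximum(np.linalg.norm(feat, axis=0, keepdims=True), eps)
    # One (N, C) x (C, H*W) product handles all tracklets at once.
    return (emb @ feat).reshape(-1, h, w)
```

Each output map peaks where the current frame's embedding best matches a previous tracklet, so the cost of adding more tracklets is just extra rows in one matrix product.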

<img src="https://github.com/JudasDie/SOTS/blob/MOT/demo/OMC.jpg" height="500" alt="OMC"/><br/>

AutoMatch [ICCV2021]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo] <br/> AutoMatch replaces the essence of Siamese tracking, i.e., the cross-correlation and its variants, with a learnable matching network. The underlying motivation is that heuristic matching network design relies heavily on expert experience. Moreover, we find experimentally that a single matching operator can hardly guarantee stable tracking in all challenging environments. In this work, we introduce six novel matching operators from the perspective of feature fusion instead of explicit similarity learning, namely Concatenation, Pointwise-Addition, Pairwise-Relation, FiLM, Simple-Transformer and Transductive-Guidance, to explore more possibilities for matching operator selection. The analyses reveal these operators' selective adaptability on different environment d
