HTR

[TCSVT 2024] Temporally Consistent Referring Video Object Segmentation with Hybrid Memory

The official implementation of the paper:

<div align="center"> <h1> <b> Temporally Consistent Referring Video Object Segmentation with Hybrid Memory </b> </h1> </div>

Introduction

Referring Video Object Segmentation (R-VOS) methods struggle to keep object segmentation consistent over time because the temporal context varies and other visually similar objects are often present. We propose the first end-to-end paradigm that identifies aligned frames for text-conditioned segmentation and propagates mask features to achieve temporally consistent R-VOS. Furthermore, we propose a new Mask Consistency Score (MCS) metric to evaluate the temporal consistency of video segmentation. Extensive experiments demonstrate that our approach enhances temporal consistency by a significant margin, leading to top-ranked performance on popular R-VOS benchmarks.

https://github.com/bo-miao/HTR/assets/53172019/7b2e7d56-59f8-4ba2-b502-c4e7ed9e0417

Installation and Data Preparation

Please refer to SgMg for installation and data preparation.

Evaluation

The checkpoint for HTR w/ SwinL is available at HTR-SwinL.

To evaluate HTR on Ref-DAVIS or Ref-YouTube-VOS, run the corresponding command from the scripts folder:

sh dist_test_davis_swinl.sh
sh dist_test_ytv_swinl.sh

MCS Metric for Temporal Consistency

The code for MCS evaluation is in get_mcs.py. On the Ref-YTVOS evaluation server, click "View scoring output log" to download the stdout.txt of your submission.

Then you can run the script to get the MCS score under different thresholds.
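For intuition only, the snippet below is a minimal sketch of a threshold-based consistency score. It is not the code in get_mcs.py and it does not parse stdout.txt. It assumes, as a simplification, that a video counts as consistent at a threshold when every one of its frames reaches that IoU, and that the score is the fraction of such videos. The function name `mask_consistency_score`, the input layout, and the example values are all hypothetical; see the paper and get_mcs.py for the actual definition.

```python
from typing import Dict, Mapping, Sequence


def mask_consistency_score(
    per_frame_iou: Mapping[str, Sequence[float]],
    thresholds: Sequence[float],
) -> Dict[float, float]:
    """Fraction of videos whose per-frame IoU stays at or above each threshold.

    ASSUMPTION: this "all frames above threshold" rule is an illustrative
    reading of a consistency score, not the verified formula from get_mcs.py.
    """
    if not per_frame_iou:
        raise ValueError("per_frame_iou must contain at least one video")

    scores: Dict[float, float] = {}
    for thr in thresholds:
        # A video is consistent only if its worst frame still meets the threshold.
        consistent = sum(
            1
            for ious in per_frame_iou.values()
            if len(ious) > 0 and min(ious) >= thr
        )
        scores[thr] = consistent / len(per_frame_iou)
    return scores


# Hypothetical per-frame IoU values for three videos (made up for illustration).
example_iou = {
    "video_a": [0.90, 0.85, 0.95],  # stable across all frames
    "video_b": [0.90, 0.40, 0.80],  # one frame drifts to another object
    "video_c": [0.60, 0.70, 0.65],  # stable but less accurate
}
scores = mask_consistency_score(example_iou, thresholds=(0.5, 0.8))
for thr, value in scores.items():
    print(f"MCS@{thr}: {value:.3f}")
```

Under this reading, a single badly segmented frame (as in `video_b`) fails the whole video, which is what separates a consistency score from a mean IoU averaged over frames.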

Citation

@article{miao2024htr,
  title={Temporally Consistent Referring Video Object Segmentation with Hybrid Memory},
  author={Miao, Bo and Bennamoun, Mohammed and Gao, Yongsheng and Shah, Mubarak and Mian, Ajmal},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2024},
  publisher={IEEE}
}

Acknowledgements

Contact

If you have any questions about this project, please feel free to contact bomiaobbb@gmail.com.