# DsHmp

[CVPR 2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
This repository contains the code for the CVPR 2024 paper:

> **Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation**
> Shuting He, Henghui Ding
> CVPR 2024
## Installation

Please see INSTALL.md. Then:

```shell
pip install -r requirements.txt
python3 -m spacy download en_core_web_sm
```
## Inference

### 1. Val<sup>u</sup> set

Obtain the output masks on the Val<sup>u</sup> set:

```shell
python train_net_dshmp.py \
  --config-file configs/dshmp_swin_tiny.yaml \
  --num-gpus 8 --dist-url auto --eval-only \
  MODEL.WEIGHTS [path_to_weights] \
  OUTPUT_DIR [output_dir]
```

Obtain the J&F results on the Val<sup>u</sup> set:

```shell
python tools/eval_mevis.py
```
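`tools/eval_mevis.py` reports J&F, where J is region similarity (mask IoU) and F is contour accuracy. As a quick sanity check on your own masks — not a substitute for the official script — the J term can be sketched with numpy as:

```python
import numpy as np

def region_similarity(pred: np.ndarray, gt: np.ndarray) -> float:
    """J metric: intersection-over-union between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:          # both masks empty -> treat as a perfect match
        return 1.0
    inter = np.logical_and(pred, gt).sum()
    return inter / union

# Toy 4x4 masks: the two 2x2 squares overlap in exactly 1 pixel,
# so J = 1 / (4 + 4 - 1) = 1/7.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[2:4, 2:4] = 1
print(round(region_similarity(pred, gt), 3))  # → 0.143
```

The official evaluation additionally computes the boundary F-score and averages both over all expressions; use `tools/eval_mevis.py` for reported numbers.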
### 2. Val set

Obtain the output masks on the Val set for CodaLab online evaluation:

```shell
python train_net_dshmp.py \
  --config-file configs/dshmp_swin_tiny.yaml \
  --num-gpus 8 --dist-url auto --eval-only \
  MODEL.WEIGHTS [path_to_weights] \
  OUTPUT_DIR [output_dir] DATASETS.TEST '("mevis_test",)'
```
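CodaLab servers generally accept a single zip of the prediction files. The exact layout is defined by the MeViS challenge page, so the following is only a hypothetical packaging step with placeholder names, not the competition's required format:

```shell
# Hypothetical packaging step -- confirm the exact expected layout on the
# MeViS CodaLab page before submitting; names below are placeholders.
OUTPUT_DIR=output_masks
mkdir -p "$OUTPUT_DIR"                  # stands in for your inference output dir
touch "$OUTPUT_DIR/example_mask.png"    # placeholder file so this sketch runs
(cd "$OUTPUT_DIR" && zip -r ../submission.zip .)  # zip the contents, not the folder
ls -lh submission.zip
```

Zipping from inside the output directory avoids nesting an extra top-level folder in the archive, a common cause of rejected submissions.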
## Training

First, download the backbone weights (`model_final_86143f.pkl`) and convert them using the provided scripts:

```shell
wget https://dl.fbaipublicfiles.com/maskformer/mask2former/coco/instance/maskformer2_swin_tiny_bs16_50ep/model_final_86143f.pkl
python tools/process_ckpt.py
python tools/get_refer_id.py
```

Then start training:

```shell
python train_net_dshmp.py \
  --config-file configs/dshmp_swin_tiny.yaml \
  --num-gpus 8 --dist-url auto \
  MODEL.WEIGHTS [path_to_weights] \
  OUTPUT_DIR [output_dir]
```

Note: We train on 8 RTX 3090 GPUs with 1 sample per GPU, which takes about 17 hours.
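The backbone file downloaded above is a Detectron2-style checkpoint: a pickle holding a dict whose `"model"` entry maps parameter names to numpy arrays. The actual conversion is repo-specific and done by `tools/process_ckpt.py`; purely to illustrate the format (this is not that script), a checkpoint of this shape can be inspected like so, using a stand-in file:

```python
import pickle
import numpy as np

# Stand-in checkpoint in the Detectron2 .pkl layout
# ("model" -> {param name: numpy array}); the real file is
# model_final_86143f.pkl downloaded above.
fake_ckpt = {
    "model": {
        "backbone.patch_embed.proj.weight": np.zeros((96, 3, 4, 4), np.float32),
        "sem_seg_head.predictor.query_feat.weight": np.zeros((100, 256), np.float32),
    },
}
with open("demo_ckpt.pkl", "wb") as f:
    pickle.dump(fake_ckpt, f)

# List parameter names and shapes, as a conversion script typically would
with open("demo_ckpt.pkl", "rb") as f:
    ckpt = pickle.load(f)
for name, arr in ckpt["model"].items():
    print(name, tuple(arr.shape))
```

Inspecting the key names this way is useful when debugging weight-loading warnings about missing or unexpected parameters.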
## Models

☁️ Google Drive
## Acknowledgement

This project is based on MeViS. Many thanks to the authors for their great work!
## BibTeX

Please consider citing DsHmp if it helps your research.

```bibtex
@inproceedings{DsHmp,
  title={Decoupling static and hierarchical motion perception for referring video segmentation},
  author={He, Shuting and Ding, Henghui},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13332--13341},
  year={2024}
}
```