# Propose-Reduce VIS

This repo contains the official implementation for the paper:

**Video Instance Segmentation with a Propose-Reduce Paradigm**
Huaijia Lin*, Ruizheng Wu*, Shu Liu, Jiangbo Lu, Jiaya Jia
ICCV 2021 | Paper

## Installation

Please refer to INSTALL.md.
## Demo

You can compute VIS results for your own videos.

- Download a pretrained ResNet-101 model and put it in the `pretrained` folder:

```
mkdir pretrained
```

- Put your example videos in `demo/inputs`. Two types of input are supported: directories of frames or `.mp4` files (see the examples for details).
- Run the following script; the results are written to `demo/outputs`:

```
sh demo.sh
```
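The two supported input layouts can be illustrated with a small shell sketch. The file and directory names below (`video1`, `clip2.mp4`) are hypothetical examples, not files shipped with the repo:

```shell
# Set up example inputs: one directory of frames and one .mp4 file
# (names here are hypothetical placeholders).
mkdir -p demo/inputs/video1
touch demo/inputs/video1/00000.jpg demo/inputs/video1/00001.jpg
touch demo/inputs/clip2.mp4

# Each entry under demo/inputs is treated as one video:
# a directory is read as a frame sequence, a .mp4 file is decoded first.
for entry in demo/inputs/*; do
  if [ -d "$entry" ]; then
    echo "frames: $entry"
  elif [ "${entry##*.}" = "mp4" ]; then
    echo "video:  $entry"
  fi
done
```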
## Data Preparation

1. Download the videos and jsons of the train and val sets from YouTube-VIS 2019.
2. Download the videos and jsons of the train and val sets from YouTube-VIS 2021.
3. Download the trainval set of DAVIS-UVOS.
4. Download the other pre-computed jsons from data.
5. Symlink the corresponding datasets and json files into the `data` folder:

```
mkdir data
```
```
data
├── trainset_ytv19 --> /path/to/ytv2019/vos/train/JPEGImages/
├── train_ytv19.json --> /path/to/ytv2019/vis/train.json
├── valset_ytv19 --> /path/to/ytv2019/vos/valid/JPEGImages/
├── valid_ytv19.json --> /path/to/ytv2019/vis/valid.json
├── trainset_ytv21 --> /path/to/ytv2021/vis/train/JPEGImages/
├── train_ytv21.json --> /path/to/ytv2021/vis/train/instances.json
├── valset_ytv21 --> /path/to/ytv2021/vis/valid/JPEGImages/
├── valid_ytv21.json --> /path/to/ytv2021/vis/valid/instances.json
├── trainvalset_davis --> /path/to/DAVIS-UnVOS/DAVIS-trainval/JPEGImages/480p/
├── train_davis.json --> /path/to/pre-computed/train_davis.json
├── valid_davis.json --> /path/to/pre-computed/valid_davis.json
```
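The symlinks above can be created with a script along these lines. The `/path/to/...` prefixes are placeholders from the layout above; replace them with your actual download locations:

```shell
# Create the data folder and symlink datasets into it.
# The /path/to/... targets are placeholders -- point them at your downloads.
mkdir -p data
ln -sfn /path/to/ytv2019/vos/train/JPEGImages data/trainset_ytv19
ln -sfn /path/to/ytv2019/vis/train.json       data/train_ytv19.json
ln -sfn /path/to/ytv2019/vos/valid/JPEGImages data/valset_ytv19
ln -sfn /path/to/ytv2019/vis/valid.json       data/valid_ytv19.json
# ...repeat for the YouTube-VIS 2021 and DAVIS-UVOS entries listed above.
```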
## Results

We provide pretrained models and evaluation scripts for several backbones. The results differ slightly from the paper because of minor modifications to the inference code.

Download the pretrained models and put them in the `pretrained` folder:

```
mkdir pretrained
```
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<tr>
<th valign="middle">Dataset</th>
<th valign="middle">Method</th>
<th valign="middle">Backbone</th>
<th valign="middle"> <a href="https://github.com/dvlab-research/ProposeReduce#todos">CA Reduce</a> </th>
<th valign="middle">AP</th>
<th valign="middle">AR@10</th>
<th valign="bottom">download</th>
</tr>
<tr><td align="center">YouTube-VIS 2019</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-50</td>
<td align="center"></td>
<td align="center"> 40.8 </td>
<td align="center"> 49.9 </td>
<td align="center"> <a href="https://drive.google.com/file/d/1P3HiwCavjRJJePuF-4D2GDQKwWT8E_LZ/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2019/eval_vis_r50.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2019</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-50</td>
<td align="center"> ✓ </td>
<td align="center"> 42.5 </td>
<td align="center"> 56.8 </td>
<td align="center"> <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2019/CateAwareReduce/eval_vis_r50.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2019</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-101</td>
<td align="center"></td>
<td align="center"> 43.8 </td>
<td align="center"> 52.7 </td>
<td align="center"> <a href="https://drive.google.com/file/d/1SmcJsIqluzjuH-uKCNs1ybNqvQClIqai/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2019/eval_vis_r101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2019</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-101</td>
<td align="center"> ✓ </td>
<td align="center"> 45.2 </td>
<td align="center"> 59.0 </td>
<td align="center"> <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2019/CateAwareReduce/eval_vis_r101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2019</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNeXt-101</td>
<td align="center"></td>
<td align="center"> 47.6 </td>
<td align="center"> 56.7 </td>
<td align="center"> <a href="https://drive.google.com/file/d/1lwjdGhjeA8rFtHtYrJbsVPY6r49jGGbN/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2019/eval_vis_x101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2019</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNeXt-101</td>
<td align="center"> ✓ </td>
<td align="center"> 48.8 </td>
<td align="center"> 62.2 </td>
<td align="center"> <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2019/CateAwareReduce/eval_vis_x101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center"></td>
<td align="center"></td>
<td align="center"></td>
<td align="center"></td>
<td align="center"></td>
<td align="center"></td>
<td align="center"></td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2021</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-50</td>
<td align="center"></td>
<td align="center"> 39.6 </td>
<td align="center"> 47.5 </td>
<td align="center"> <a href="https://drive.google.com/file/d/12NQMY59USqMi7--zyZytKVaUmf0MGegP/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2021/eval_vis_r50.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2021</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-50</td>
<td align="center"> ✓ </td>
<td align="center"> 41.7 </td>
<td align="center"> 54.9 </td>
<td align="center"> <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2021/CateAwareReduce/eval_vis_r50.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2021</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNeXt-101</td>
<td align="center"> </td>
<td align="center"> 45.6 </td>
<td align="center"> 52.9 </td>
<td align="center"> <a href="https://drive.google.com/file/d/1aOHPmVkoF9ZeBOSORlybPBqpZoIqg2SA/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2021/eval_vis_x101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">YouTube-VIS 2021</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNeXt-101</td>
<td align="center"> ✓ </td>
<td align="center"> 47.2 </td>
<td align="center"> 57.6 </td>
<td align="center"> <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/YTV2021/CateAwareReduce/eval_vis_x101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
</tbody></table>
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<tr>
<th valign="middle">Dataset</th>
<th valign="middle">Method</th>
<th valign="middle">Backbone</th>
<th valign="middle">J&amp;F</th>
<th valign="middle">J</th>
<th valign="middle">F</th>
<th valign="bottom">download</th>
</tr>
<tr><td align="center">DAVIS-UVOS</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNet-101</td>
<td align="center"> 68.1 </td>
<td align="center"> 64.9 </td>
<td align="center"> 71.4 </td>
<td align="center"> <a href="https://drive.google.com/file/d/1gOgpEQ1rhFVCRRqR98Jr4s9MhWMUPvzl/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/DAVIS/eval_vis_r101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
<tr><td align="center">DAVIS-UVOS</td>
<td align="center">Seq Mask R-CNN</td>
<td align="center">ResNeXt-101</td>
<td align="center"> 70.6 </td>
<td align="center"> 67.2 </td>
<td align="center"> 73.9 </td>
<td align="center"> <a href="https://drive.google.com/file/d/1fKNCS2ONTD3q9B4oB8TCTpMz7J0CLNtX/view?usp=sharing">model</a> | <a href="https://github.com/dvlab-research/ProposeReduce/blob/main/scripts/DAVIS/eval_vis_x101.sh">scripts</a> </td>
<!-- <td align="center"> To be released </td> -->
</tbody></table>
## Evaluation

- YouTube-VIS 2019: a json file is saved in the `../Results_ytv19` folder. Zip it and upload it to the CodaLab server.
- YouTube-VIS 2021: a json file is saved in the `../Results_ytv21` folder. Zip it and upload it to the CodaLab server.
- DAVIS-UVOS: color masks are saved in the `../Results_davis` folder. Please use the official code for evaluation.
## Training

To reproduce the results, please refer to the training scripts.
