# Panoptic Video Scene Graph Generation
<p align="center"> <!-- <img src="./assets/psgtr_long.gif" align="center" width="80%"> -->https://github.com/Jingkang50/OpenPVSG/assets/17070708/54a0f4c4-daca-4168-8460-95eb4cf8b85a
<video controls> <source src="[https://github.com/Jingkang50/OpenPVSG/assets/17070708/54a0f4c4-daca-4168-8460-95eb4cf8b85a](https://github.com/Jingkang50/OpenPVSG/assets/17070708/54a0f4c4-daca-4168-8460-95eb4cf8b85a)" type="video/mp4"> Your browser does not support the video tag. </video> <p align="center"> <a href="https://arxiv.org/abs/2311.17058" target='_blank'> <img src="https://img.shields.io/badge/Paper-CVPR%202023-b31b1b?style=flat-square"> </a> <a href="https://jingkang50.github.io/PVSG/" target='_blank'> <img src="https://img.shields.io/badge/Page-jingkang50/PVSG-228c22?style=flat-square"> </a> <a href="https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EpHpnXP-ta9Nu1wD6FwkDWAB0LxY8oE9VNqsgv6ln-i8QQ?e=fURefF" target='_blank'> <img src="https://img.shields.io/badge/Data-PVSGDataset-334b7f?style=flat-square"> </a> <a href="https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EgvpTfCTMudLpxw-h0_BVdcBAHacUaAQD-u9OvkUlpaDBg?e=LXnqaX" target='_blank'> <img src="https://img.shields.io/badge/Data-QuickView-7de5f6?style=flat-square"> </a> <a href="https://github.com/LilyDaytoy/OpenPVSG" target='_blank'> <img src="https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FLilyDaytoy%2FPVSG&count_bg=%23FFA500&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=visitors&edge_flat=true"> </p> </a> <p align="center"> <font size=5><strong>Panoptic Video Scene Graph Generation</strong></font> <br> <a href="https://jingkang50.github.io/">Jingkang Yang</a>, <a href="https://lilydaytoy.github.io/">Wenxuan Peng</a>, <a href="https://lxtgh.github.io/">Xiangtai Li</a>,<br> <a href="https://scholar.google.com/citations?user=G8DPsoUAAAAJ&hl=zh-CN">Zujin Guo</a>, <a href="https://cliangyu.com/"> Liangyu Chen</a>, <a href="https://brianboli.com/">Bo Li</a>, <a href="https://www.linkedin.com/in/zheng-ma-4201223a/?originalSubdomain=hk">Zheng Ma</a>,<br> <a href="https://kaiyangzhou.github.io/">Kaiyang Zhou</a>, <a href="https://bmild.github.io/">Wayne Zhang</a>, <a href="https://www.mmlab-ntu.com/person/ccloy/">Chen Change Loy</a>, <a href="https://liuziwei7.github.io/">Ziwei Liu</a>, <br> S-Lab, Nanyang Technological University & SenseTime Research </p> </p>What is PVSG Task?
<strong>The Panoptic Video Scene Graph Generation (PVSG) task</strong> aims to interpret a complex video with a dynamic scene graph representation, where every node in the scene graph is grounded by its pixel-accurate segmentation mask tube in the video.
<b>Given a video, PVSG models need to generate a dynamic (temporal) scene graph that is grounded by panoptic mask tubes.</b>
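To make the expected output more concrete, here is a minimal sketch of how a grounded dynamic scene graph could be represented in Python. The field names below are illustrative assumptions for exposition, not the dataset's actual annotation schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple
import numpy as np

# Illustrative (hypothetical) containers for a PVSG-style prediction.
# Field names are assumptions, not the official annotation format.

@dataclass
class MaskTube:
    """Pixel-accurate segmentation of one object across frames."""
    object_id: int
    category: str                                               # e.g. "person", "cup"
    masks: Dict[int, np.ndarray] = field(default_factory=dict)  # frame index -> binary mask (H, W)

@dataclass
class Relation:
    """One edge of the dynamic scene graph, valid over time spans."""
    subject_id: int
    object_id: int
    predicate: str                                               # e.g. "holding"
    spans: List[Tuple[int, int]] = field(default_factory=list)   # (start_frame, end_frame) intervals

@dataclass
class VideoSceneGraph:
    video_id: str
    tubes: List[MaskTube] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)
```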
## The PVSG Dataset
We carefully collect 400 videos, each featuring dynamic scenes and rich in logical reasoning content. On average, these videos are 76.5 seconds long (5 FPS). The collection comprises 289 videos from VidOR, 55 videos from EpicKitchen, and 56 videos from Ego4D.
Please download the dataset via this link, and place the downloaded zip files as shown in the directory layout below.
```
├── assets
├── checkpoints
├── configs
├── data
├── data_zip
│   ├── Ego4D
│   │   ├── ego4d_masks.zip
│   │   └── ego4d_videos.zip
│   ├── EpicKitchen
│   │   ├── epic_kitchen_masks.zip
│   │   └── epic_kitchen_videos.zip
│   ├── VidOR
│   │   ├── vidor_masks.zip
│   │   └── vidor_videos.zip
│   └── pvsg.json
├── datasets
├── models
├── scripts
├── tools
├── utils
├── .gitignore
├── environment.yml
└── README.md
```
Please run `unzip_and_extract.py` to unzip the files and extract frames from the videos. If you unzip the archives yourself, make sure to use `unzip -j xxx.zip` to strip the junk paths (a minimal Python equivalent is sketched after the layout below). Your data directory should then look like this:
```
data
├── ego4d
│   ├── frames
│   ├── masks
│   └── videos
├── epic_kitchen
│   ├── frames
│   ├── masks
│   └── videos
├── vidor
│   ├── frames
│   ├── masks
│   └── videos
└── pvsg.json
```
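For reference, stripping junk paths with `unzip -j` is roughly equivalent to flattening each archive entry to its base name. The sketch below shows that behaviour in Python; it is illustrative only (the provided `unzip_and_extract.py` remains the supported path, and the example paths are hypothetical).

```python
import zipfile
from pathlib import Path

def unzip_flat(zip_path: str, out_dir: str) -> None:
    """Extract a zip while dropping internal directory prefixes, like `unzip -j`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            target = out / Path(info.filename).name  # keep only the file name
            with zf.open(info) as src, open(target, "wb") as dst:
                dst.write(src.read())

# Hypothetical example:
# unzip_flat("data_zip/Ego4D/ego4d_masks.zip", "data/ego4d/masks")
```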
We suggest playing with `./tools/Visualize_Dataset.ipynb` to quickly get familiar with the PVSG dataset.
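If you only want a quick peek at the annotation file without opening the notebook, a minimal sketch like the following works; it assumes only that `pvsg.json` is standard JSON and prints whatever top-level keys the file actually contains.

```python
import json

# Peek at the annotation file; we only assume it is standard JSON.
with open("data/pvsg.json") as f:
    anno = json.load(f)

if isinstance(anno, dict):
    for key, value in anno.items():
        size = len(value) if hasattr(value, "__len__") else value
        print(f"{key}: {size}")
else:
    print(type(anno), len(anno))
```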
## Get Started
We use conda to manage dependencies, and our experiments were run with CUDA 10.1. Specify the cudatoolkit version appropriate for your machine in `environment.yml`, then run the following to create the conda environment:
```bash
conda env create -f environment.yml
conda activate openpvsg
```
Then manually install the following dependencies:
```bash
# Install mmcv
pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html
conda install -c conda-forge pycocotools
pip install mmdet==2.25.0

# already within environment.yml
pip install timm
python -m pip install scipy
pip install git+https://github.com/cocodataset/panopticapi.git

# for unitrack
pip install imageio==2.6.1
pip install lap==0.4.0
pip install cython_bbox==0.1.3

# for vps
pip install seaborn
pip install ftfy
pip install regex

# If you're using wandb for logging
pip install wandb
wandb login
```
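As an optional sanity check (not part of the official setup), you can verify that the core packages import cleanly and that the expected versions were picked up:

```python
# Optional sanity check: confirm the pinned packages and CUDA build are visible.
import torch
import mmcv
import mmdet

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)  # expect a CUDA 10.1 build
print("mmcv-full:", mmcv.__version__)                              # expect 1.4.0
print("mmdet:", mmdet.__version__)                                 # expect 2.25.0
print("GPU available:", torch.cuda.is_available())
```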
Download the pretrained models for tracking if you are interested in the IPS+Tracking solution.
## Training and Testing
### IPS+Tracking & Relation Modeling
```bash
# Train IPS
sh scripts/train/train_ips.sh

# Tracking and save query features
sh scripts/utils/prepare_qf_ips.sh

# Prepare for relation modeling
sh scripts/utils/prepare_rel_set.sh

# Train relation models
sh scripts/train/train_relation.sh

# Test
sh scripts/test/test_relation_full.sh
```
### VPS & Relation Modeling
```bash
# Train VPS
sh scripts/train/train_vps.sh

# Save query features
sh scripts/utils/prepare_qf_vps.sh

# Prepare for relation modeling
sh scripts/utils/prepare_rel_set.sh

# Train relation models
sh scripts/train/train_relation.sh

# Test
sh scripts/test/test_relation_full.sh
```
## Model Zoo
| Method | M2F ckpt | vanilla | filter | conv | transformer |
| --- | --- | --- | --- | --- | --- |
| mask2former_ips | link | link | link | link | link |
| mask2former_vps | link | link | link | link | link |
## Citation
If you find our repository useful for your research, please consider citing our paper:
```bibtex
@inproceedings{yang2023pvsg,
    author    = {Yang, Jingkang and Peng, Wenxuan and Li, Xiangtai and Guo, Zujin and Chen, Liangyu and Li, Bo and Ma, Zheng and Zhou, Kaiyang and Zhang, Wayne and Loy, Chen Change and Liu, Ziwei},
    title     = {Panoptic Video Scene Graph Generation},
    booktitle = {CVPR},
    year      = {2023},
}
```