VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

This repository provides unified implementations for the VBench series of works, supporting comprehensive evaluation of video generative models across a wide spectrum of capabilities and settings.

If your questions are not addressed in this README, please contact Ziqi Huang at ZIQI002 [at] e [dot] ntu [dot] edu [dot] sg.

<a name="overview"></a>

:mega: Overview

(1) VBench

TL;DR: Evaluating Video Generation — Benchmark • Evaluation Dimensions • Evaluation Methods • Human Alignment • Insights

VBench Paper (CVPR 2024) <br> VBench: Comprehensive Benchmark Suite for Video Generative Models <br> Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin<sup>+</sup>, Yu Qiao<sup>+</sup>, Ziwei Liu<sup>+</sup><br> IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

We propose VBench, a comprehensive benchmark suite for video generative models. We design a comprehensive and hierarchical <b>Evaluation Dimension Suite</b> that decomposes "video generation quality" into multiple well-defined dimensions to facilitate fine-grained and objective evaluation. For each dimension and each content category, we carefully design a <b>Prompt Suite</b> as test cases and sample <b>Generated Videos</b> from a set of video generation models. For each evaluation dimension, we design an <b>Evaluation Method Suite</b>, which uses a carefully crafted method or designated pipeline for automatic objective evaluation. We also conduct <b>Human Preference Annotation</b> of the generated videos for each dimension and show that VBench evaluation results are <b>well aligned with human perceptions</b>. VBench can provide valuable insights from multiple perspectives.
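The per-dimension design described above can be pictured with a small sketch. This is not the VBench API: the function name `evaluate_by_dimension`, the `Evaluator` type, and the placeholder dimension names are assumptions made for illustration only. It shows one way to pair each dimension with its own scoring method and report a separate score per dimension instead of a single aggregate number.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical type: an evaluator maps one video path to a numeric score.
Evaluator = Callable[[str], float]


def evaluate_by_dimension(
    videos_by_dimension: Dict[str, List[str]],
    evaluators: Dict[str, Evaluator],
) -> Dict[str, float]:
    """Score every dimension separately, using that dimension's own evaluator."""
    results: Dict[str, float] = {}
    for dimension, videos in videos_by_dimension.items():
        if dimension not in evaluators:
            raise KeyError(f"no evaluator registered for dimension {dimension!r}")
        if not videos:
            raise ValueError(f"no videos supplied for dimension {dimension!r}")
        scorer = evaluators[dimension]
        # Report the mean score over the videos sampled for this dimension.
        results[dimension] = mean(scorer(video) for video in videos)
    return results
```

A caller would register one evaluator per dimension (for example `{"dimension_a": my_scorer}`, where `my_scorer` is a placeholder) and pass the videos generated from that dimension's prompts. Keeping the results keyed by dimension preserves the fine-grained view the paragraph describes.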

Note: The code and README for the VBench components are located here, relative path: ..

@InProceedings{huang2023vbench,
    title={{VBench}: Comprehensive Benchmark Suite for Video Generative Models},
    author={Huang, Ziqi and He, Yinan and Yu, Jiashuo and Zhang, Fan and Si, Chenyang and Jiang, Yuming and Zhang, Yuanhan and Wu, Tianxing and Jin, Qingyang and Chanpaisit, Nattapol and Wang, Yaohui and Chen, Xinyuan and Wang, Limin and Lin, Dahua and Qiao, Yu and Liu, Ziwei},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2024}
}

(2) VBench++

TL;DR: Extends VBench with (1) VBench-I2V for image-to-video, (2) VBench-Long for long videos, and (3) VBench-Trustworthiness covering fairness, bias, and safety.

VBench++ (TPAMI 2025) <br> VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models <br> Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin<sup>+</sup>, Yu Qiao<sup>+</sup>, Ziwei Liu<sup>+</sup><br> IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

<b>VBench++</b> supports a wide range of video generation tasks, including text-to-video and image-to-video, with an adaptive Image Suite for fair evaluation across different settings. It evaluates not only technical quality but also the trustworthiness of generative models, offering a comprehensive view of model performance. We continually incorporate more video generative models into VBench to inform the community about the evolving landscape of video generation.

Note: The code and README for the VBench++ components are located at:

  • (1) VBench-I2V (image-to-video), relative path: vbench2_beta_i2v
  • (2) VBench-Long (long video evaluation), relative path: vbench2_beta_long
  • (3) VBench-Trustworthiness (fairness, bias, and safety), relative path: vbench2_beta_trustworthiness

*These modules belong to VBench++, not to VBench or VBench-2.0. However, to maintain backward compatibility for users who have already installed the repository, we preserve the original relative path names and provide this clarification here.*
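Because the directory names say `vbench2_beta_*` while the components belong to VBench++, a small lookup table can help when scripting against a local checkout. The sketch below is a hypothetical helper, not part of the repository: the names `VBENCH_PP_MODULES` and `locate_modules` are assumptions, and only the three relative paths come from this README.

```python
from pathlib import Path
from typing import Dict

# Relative paths as listed in this README; the dictionary itself is illustrative.
VBENCH_PP_MODULES: Dict[str, str] = {
    "VBench-I2V": "vbench2_beta_i2v",
    "VBench-Long": "vbench2_beta_long",
    "VBench-Trustworthiness": "vbench2_beta_trustworthiness",
}


def locate_modules(repo_root: str) -> Dict[str, Path]:
    """Return the directory of each VBench++ component found under repo_root."""
    root = Path(repo_root)
    found: Dict[str, Path] = {}
    for name, relative_path in VBENCH_PP_MODULES.items():
        candidate = root / relative_path
        # Skip components that are missing from this checkout.
        if candidate.is_dir():
            found[name] = candidate
    return found
```

Calling `locate_modules(".")` from the root of a clone would return only the components present in that checkout, keyed by their VBench++ names.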

@article{huang2025vbench++,
    title={{VBench++}: Comprehensive and Versatile Benchmark Suite for Video Generative Models},
    author={Huang, Ziqi and Zhang, Fan and Xu, Xiaojie and He, Yinan and Yu, Jiashuo and Dong, Ziyue and Ma, Qianli and Chanpaisit, Nattapol and Si, Chenyang and Jiang, Yuming and Wang, Yaohui and Chen, Xinyuan and Chen, Ying-Cong and Wang, Limin and Lin, Dahua and Qiao, Yu and Liu, Ziwei},
    journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
    year={2025},
    doi={10.1109/TPAMI.2025.3633890}
}

(3) VBench-2.0

TL;DR: Extends VBench to evaluate intrinsic faithfulness — a key challenge for next-generation video generation models.

VBench-2.0 Report (arXiv) <br> VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness<br> Dian Zheng, Ziqi Huang, Hongbo Liu, Kai Zou, Yinan He, Fan Zhang, Yuanhan Zhang, Jingwen He, [Wei-Shi Zheng](https://www.isee-ai.cn/~zhwsh
