VideoDataset

A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.

[!WARNING] VideoDataset is in the alpha phase. Expect frequent changes and instability. Feedback, comments, suggestions, and contributions are welcome!

Overview

VideoDataset is a high-performance video decoding library with support for multiple frameworks. It aims to provide framework-integrated solutions for video decoding tasks.

Key Features:

  • GPU-accelerated video decoding using the NvCodec library
  • Support for common video formats (H.264, H.265, etc.)
  • Easy integration with multiple frameworks and formats

Installation

Prerequisites

  • NVIDIA GPU with CUDA support and CUDA Toolkit 12.0+ installed
  • Python 3.10 or later
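Before installing, it can help to confirm that the prerequisites above are met. The following is a minimal sketch using only the Python standard library; it is not part of VideoDataset. It assumes that `nvcc` (shipped with the CUDA Toolkit) and `nvidia-smi` (shipped with the NVIDIA driver) are discoverable on `PATH`, which may not hold on every system even when CUDA is installed. The helper names are illustrative.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)  # from the prerequisites above
MIN_CUDA = (12, 0)    # from the prerequisites above


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def parse_cuda_release(nvcc_output: str):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_toolkit_version():
    """Return the CUDA Toolkit version reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    cuda = cuda_toolkit_version()
    print("Python >= 3.10:", python_ok())
    print("nvidia-smi on PATH:", shutil.which("nvidia-smi") is not None)
    print("CUDA Toolkit >= 12.0:", cuda is not None and cuda >= MIN_CUDA)
```

A `False` result for the CUDA check only means that `nvcc` was not found or reported an older release; the toolkit may still be installed in a location that is not on `PATH`.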

Install from PyPI

pip install agibot-videodataset

Building from Source

pip install git+https://github.com/AgiBot-World/VideoDataset.git

Quick Start

The complete example can be found in the quickstart documentation.

Documentation

Please refer to the full documentation.

Sphinx-based documentation can also be generated locally by running the following command:

make dev-doc doc

This generates the documentation in the docs/_build/html directory and serves it at http://localhost:8000.

Performance

VideoDataset is optimized for high-throughput video processing. Benchmark results show:

  • GPU Decoding: A decoding throughput of 20,000 FPS is achieved in a multiprocessing scenario.
  • Random Access: Minimal overhead for non-sequential frame access.
  • GPU Decoder Utilization: Over 90% GPU decoder utilization is achieved in a multiprocessing scenario.

See the benchmark documentation for detailed performance analysis.
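To make the throughput and random-access figures concrete, the sketch below shows one way such numbers could be measured on your own setup. It is not the project's benchmark code: `fake_decode` is a hypothetical stand-in for a real per-frame decode call, and `measure_fps` and `random_access_overhead` are illustrative helper names. The idea is to time the same set of frame indices once in sequential order and once shuffled, then compare the two rates.

```python
import random
import time
from typing import Callable, Sequence


def measure_fps(decode_frame: Callable[[int], object], indices: Sequence[int]) -> float:
    """Return frames decoded per second over the given frame indices."""
    start = time.perf_counter()
    for index in indices:
        decode_frame(index)
    elapsed = time.perf_counter() - start
    return len(indices) / elapsed if elapsed > 0 else float("inf")


def random_access_overhead(sequential_fps: float, random_fps: float) -> float:
    """Relative slowdown of random access; 0.0 means no overhead."""
    return sequential_fps / random_fps - 1.0


def fake_decode(index: int) -> bytes:
    # Hypothetical stand-in for a real decoder; replace with your own decode call.
    return bytes(16)


if __name__ == "__main__":
    frame_count = 10_000
    sequential = list(range(frame_count))
    shuffled = random.sample(sequential, k=frame_count)

    seq_fps = measure_fps(fake_decode, sequential)
    rnd_fps = measure_fps(fake_decode, shuffled)
    print(f"sequential: {seq_fps:,.0f} FPS")
    print(f"random:     {rnd_fps:,.0f} FPS")
    print(f"overhead:   {random_access_overhead(seq_fps, rnd_fps):.1%}")
```

With the stand-in decoder the two rates are essentially identical, so the printed overhead is noise. Substituting a real decode call is what makes the comparison meaningful, and a multiprocessing measurement like the one reported above would additionally need to aggregate frame counts across worker processes.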

Comparison with other CPU decoding solutions

In addition, we conducted a comprehensive benchmark comparing VideoDataset against mainstream CPU software decoding solutions, including OpenCV, Torchvision (PyAV), Torchvision (VideoReader), and TorchCodec (CPU). The results show that VideoDataset achieves 3 to 4 times higher decoding throughput.

[Figure: CPU Throughput]

VideoDataset also substantially reduces CPU utilization compared with these solutions.

[Figure: CPU Utilization]

Development Status

  • [X] GPU acceleration via NvCodec
  • [X] Random frame access
  • [X] PyTorch integration
  • [ ] Compatibility with LeRobot
  • [ ] Asynchronous pipeline optimization

License

MIT License. For more details, see the LICENSE file.
