VideoDataset

A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.

[!WARNING] VideoDataset is in the Alpha phase. Frequent changes and instability should be anticipated. Any feedback, comments, suggestions and contributions are welcome!

Overview

VideoDataset is a high-performance video decoding library that supports multiple frameworks. It aims to provide framework-integrated solutions for video decoding tasks.

Key Features:

GPU-accelerated video decoding using the NvCodec library
Support for common video formats (H.264, H.265, etc.)
Easy integration with multiple frameworks and formats

Installation

Prerequisites

NVIDIA GPU with CUDA support and CUDA Toolkit 12.0+ installed
Python 3.10 or later
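
The prerequisites above can be checked before installing. The following is a minimal sketch and not part of VideoDataset: the helper names (`python_ok`, `parse_nvcc_release`, `cuda_toolkit_version`, `check_prerequisites`) are hypothetical, and it assumes that the CUDA Toolkit's `nvcc` is on `PATH` and reports its version in the form `release <major>.<minor>`.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)  # "Python 3.10 or later"
MIN_CUDA = (12, 0)    # "CUDA Toolkit 12.0+"


def python_ok(version=None):
    """Return True if a (major, minor, ...) version meets MIN_PYTHON."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_toolkit_version():
    """Return the installed CUDA Toolkit version, or None if nvcc is not found."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not python_ok():
        problems.append("Python 3.10 or later is required")
    cuda = cuda_toolkit_version()
    if cuda is None:
        problems.append("CUDA Toolkit not found (nvcc is not on PATH)")
    elif cuda < MIN_CUDA:
        problems.append(f"CUDA Toolkit {cuda[0]}.{cuda[1]} found, 12.0+ is required")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

This only checks the toolkit and interpreter versions. It does not verify that a CUDA-capable NVIDIA GPU is present or that its driver is compatible.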

Install from PyPI

pip install agibot-videodataset

Building from Source

pip install git+https://github.com/AgiBot-World/VideoDataset.git

Quick Start

The complete example can be found in the quickstart documentation.

Documentation

Please refer to the full documentation here.

Sphinx-based documentation can also be generated by running the following command:

make dev-doc doc

This generates the documentation in the docs/_build/html directory and serves it at http://localhost:8000.

Performance

VideoDataset is optimized for high-throughput video processing. Benchmark results show:

GPU Decoding: A decoding throughput of 20,000 FPS is achieved in a multiprocessing scenario.
Random Access: Minimal overhead for non-sequential frame access.
GPU Decoder Utilization: Over 90% GPU decoder utilization is achieved in a multiprocessing scenario.

See the benchmark documentation for detailed performance analysis.
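
The throughput figures above are expressed in frames per second (FPS). As a rough illustration of how such a number can be measured for non-sequential access, the sketch below times an arbitrary decode callable over randomly chosen frame indices. It does not use the VideoDataset API: `measure_random_access_fps` and `fake_decode` are hypothetical names, and `fake_decode` is a stand-in to be replaced with a real frame-decoding call.

```python
import random
import time


def measure_random_access_fps(decode_frame, num_frames, num_samples, seed=0):
    """Decode `num_samples` randomly chosen frames and return (count, fps)."""
    rng = random.Random(seed)
    indices = [rng.randrange(num_frames) for _ in range(num_samples)]

    start = time.perf_counter()
    for index in indices:
        decode_frame(index)
    elapsed = time.perf_counter() - start

    # Guard against a zero reading from a very coarse timer.
    fps = len(indices) / elapsed if elapsed > 0 else float("inf")
    return len(indices), fps


def fake_decode(index):
    """Stand-in decoder (assumption): returns a dummy 16-byte frame."""
    return bytes([index % 256]) * 16


if __name__ == "__main__":
    count, fps = measure_random_access_fps(
        fake_decode, num_frames=1000, num_samples=10_000
    )
    print(f"decoded {count} frames at {fps:.0f} FPS")
```

When timing a real GPU decoder, the timed region should include whatever synchronization is needed for the decoded frames to be available. Otherwise asynchronous work may be undercounted.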

Comparison with other CPU decoding solutions

In addition, we conducted a comprehensive benchmark comparing VideoDataset against mainstream CPU software decoding solutions, including OpenCV, Torchvision (PyAV), Torchvision (VideoReader), and TorchCodec (CPU). The results show that VideoDataset achieves 3 to 4 times higher decoding throughput.

[Figure: CPU Throughput]

VideoDataset also reduces CPU utilization compared with these CPU decoding solutions.

[Figure: CPU Utilization]

Development Status

[X] GPU acceleration via NvCodec
[X] Random frame access
[X] PyTorch integration
[ ] Compatibility with LeRobot
[ ] Asynchronous pipeline optimization

License

MIT License. For more details, see the LICENSE file.
