Ffmpegcv
ffmpegcv is an FFmpeg-backed video reader and writer with an OpenCV-like API.
FFMPEGCV is an alternative to OPENCV for reading and writing video.
English Version | 中文版本
Here is the Python version of ffmpegcv. For the C++ version, please visit FFMPEGCV-CPP
ffmpegcv provides a Video Reader and a Video Writer built on an ffmpeg backend; they are faster and more powerful than their cv2 counterparts. It integrates smoothly into a deep learning pipeline.
- ffmpegcv is API-compatible with open-cv.
- ffmpegcv can use the GPU to accelerate encoding and decoding*.
- ffmpegcv supports many more video codecs than open-cv.
- ffmpegcv supports RGB, BGR, and GRAY pixel formats.
- ffmpegcv offers a shortcut that delivers frames in fp32 CHW or HWC format to CUDA memory.
- ffmpegcv supports low-latency stream reading (IP cameras).
- ffmpegcv supports ROI operations: you can crop, resize, and pad the ROI.
In short, the ffmpegcv API closely mirrors the opencv API, but it offers more codecs and doesn't require opencv to be installed at all. It is well suited to deep learning pipelines.
<p align="center"> <img src="https://i.imghippo.com/files/cg9641723107581.jpg" width="95%"> </p>
Functions:
- VideoWriter: Write a video file.
- VideoCapture: Read a video file.
- VideoCaptureNV: Read a video file with an NVIDIA GPU.
- VideoCaptureQSV: Read a video file with Intel QuickSync Video.
- VideoCaptureCAM: Read a camera.
- VideoCaptureStream: Read an RTP/RTSP/RTMP/HTTP stream.
- VideoCaptureStreamRT: Read an RTSP stream (IP camera) in real time, with latency as low as possible.
- noblock: Read/write a video file in the background using multiprocessing.
- toCUDA: Transfer a video/stream as CHW/HWC-float32 format into a CUDA device, >2x faster.
Install
You need to download ffmpeg before you can use ffmpegcv.
#1A. LINUX: sudo apt install ffmpeg
#1B. MAC (No NVIDIA GPU): brew install ffmpeg
#1C. WINDOWS: download ffmpeg and add to the path
#1D. CONDA: conda install ffmpeg=6.0.0 #don't use the default 4.x.x version
#2A. python
pip install ffmpegcv #stable version
pip install git+https://github.com/chenxinfeng4/ffmpegcv #latest version
#2B. recommended only when you want advanced functions. See the toCUDA section
pip install ffmpegcv[cuda]
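Since ffmpegcv depends on an ffmpeg executable being installed, it can help to confirm that one is reachable before going further. The following is a minimal sketch using only the Python standard library; the helper names `find_ffmpeg` and `ffmpeg_version_line` are illustrative and are not part of ffmpegcv.

```python
import shutil
import subprocess


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(executable)


def ffmpeg_version_line(executable="ffmpeg"):
    """Return the first line printed by `<executable> -version`, or None if missing."""
    path = find_ffmpeg(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""
```

Calling `ffmpeg_version_line()` returns `None` when ffmpeg is not on the path, which is a cheap check to run before opening any video.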
When to choose ffmpegcv over opencv:
- opencv is hard to install. ffmpegcv only requires numpy and FFmpeg, and works across Mac/Windows/Linux platforms.
- opencv packages too many image processing toolboxes. You just want simple video/camera IO with GPU access.
- opencv doesn't support profiling h264/h265 and other video writers.
- You want to crop, resize, and pad the video/camera ROI.
- You are interested in deep learning pipelines.
Basic example
Read a video by CPU, and rewrite it by GPU.
import cv2
import ffmpegcv

vidin = ffmpegcv.VideoCapture(vfile_in)
vidout = ffmpegcv.VideoWriterNV(vfile_out, 'h264', vidin.fps) #NVIDIA-GPU
with vidin, vidout:
    for frame in vidin:
        cv2.imshow('image', frame)
        vidout.write(frame)
Read the camera.
# by device ID
cap = ffmpegcv.VideoCaptureCAM(0)
# by device name
cap = ffmpegcv.VideoCaptureCAM("Integrated Camera")
Deeplearning pipeline.
"""
—————————— NVIDIA GPU accelerating ⤴⤴ ———————
| |
V V
video -> decode -> crop -> resize -> RGB -> CUDA:CHW float32 -> model
"""
cap = ffmpegcv.toCUDA(
    ffmpegcv.VideoCaptureNV(file, pix_fmt='nv12', resize=(W,H)),
    tensor_format='chw')
for frame_CHW_cuda in cap:
    frame_CHW_cuda = (frame_CHW_cuda - mean) / std
    result = model(frame_CHW_cuda)
Cross platform
ffmpegcv is based on Python and FFmpeg, so it runs across Windows, Linux, and Mac, on both x86 and ARM systems.
GPU Acceleration
- Supports NVIDIA cards only; tested on x86_64 only.
- Works on Windows, Linux, and Anaconda.
- Works in Google Colab notebooks.
- Not available on macOS, because ffmpeg there does not support NVIDIA at all.
* The ffmpegcv GPU reader is a bit slower than the CPU reader, but much faster when using ROI operations (crop, resize, pad).
Codecs
| Codecs      | OpenCV-reader | ffmpegcv-cpu-r   | gpu-r | OpenCV-writer | ffmpegcv-cpu-w   | gpu-w |
| ----------- | ------------- | ---------------- | ----- | ------------- | ---------------- | ----- |
| h264        | √             | √                | √     | ×             | √                | √     |
| h265 (hevc) | not sure      | √                | √     | ×             | √                | √     |
| mjpeg       | √             | √                | ×     | √             | √                | ×     |
| mpeg        | √             | √                | ×     | √             | √                | ×     |
| others      | not sure      | ffmpeg -decoders | ×     | not sure      | ffmpeg -encoders | ×     |
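The "others" row points at `ffmpeg -decoders` and `ffmpeg -encoders` for the full codec list. As a rough sketch, that listing can be read from Python as shown here. The helper names are hypothetical, and the parser assumes the layout printed by common ffmpeg builds (a legend, a dashed separator line, then one codec per line), which may differ between versions.

```python
import subprocess


def parse_codec_listing(text):
    """Parse the text printed by `ffmpeg -decoders` or `ffmpeg -encoders`.

    Returns a dict mapping codec name -> (flags, description).
    Assumes the entries follow a separator line made of dashes.
    """
    codecs = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_table:
            # The legend ends at a line such as " ------".
            in_table = stripped.startswith("---")
            continue
        parts = stripped.split(None, 2)
        if len(parts) >= 2:
            flags, name = parts[0], parts[1]
            description = parts[2] if len(parts) == 3 else ""
            codecs[name] = (flags, description)
    return codecs


def list_ffmpeg_codecs(kind="decoders", executable="ffmpeg"):
    """Run `ffmpeg -decoders` or `ffmpeg -encoders` and parse the result."""
    if kind not in ("decoders", "encoders"):
        raise ValueError("kind must be 'decoders' or 'encoders'")
    result = subprocess.run(
        [executable, "-hide_banner", f"-{kind}"],
        capture_output=True, text=True, check=True,
    )
    return parse_codec_listing(result.stdout)
```

For example, `"hevc" in list_ffmpeg_codecs("decoders")` tells you whether the local ffmpeg build can decode h265 at all.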
Benchmark
On the way...(maybe never)
Video Reader
The ffmpegcv API closely follows the opencv API.
# open cv
import cv2
cap = cv2.VideoCapture(file)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    pass
# ffmpegcv
import ffmpegcv
cap = ffmpegcv.VideoCapture(file)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    pass
cap.release()
# alternative
cap = ffmpegcv.VideoCapture(file)
nframe = len(cap)
for frame in cap:
    pass
cap.release()
# more pythonic, recommended
with ffmpegcv.VideoCapture(file) as cap:
    nframe = len(cap)
    for iframe, frame in enumerate(cap):
        if iframe>100: break
        pass
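Because both libraries expose the same `read()` contract, returning a `(ret, frame)` pair, frame-consuming code can be written once against that contract. The sketch below is illustrative only: `iter_frames` and `FakeCapture` are hypothetical names, and `FakeCapture` exists just so the example runs without a video file.

```python
def iter_frames(cap, max_frames=None):
    """Yield frames from any capture object whose read() returns (ret, frame)."""
    count = 0
    while max_frames is None or count < max_frames:
        ret, frame = cap.read()
        if not ret:
            break
        yield frame
        count += 1


class FakeCapture:
    """Stand-in capture used only for demonstration; not part of ffmpegcv or cv2."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._pos = 0

    def read(self):
        if self._pos >= len(self._frames):
            return False, None
        frame = self._frames[self._pos]
        self._pos += 1
        return True, frame
```

In practice you would pass `cv2.VideoCapture(file)` or `ffmpegcv.VideoCapture(file)` in place of `FakeCapture`, leaving the loop body unchanged.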
Use the GPU to accelerate decoding. Availability depends on the video codec and its matching ffmpeg decoder, e.g. h264_cuvid, hevc_cuvid.
cap_cpu = ffmpegcv.VideoCapture(file)
cap_gpu0 = ffmpegcv.VideoCaptureNV(file) #NVIDIA GPU0
cap_gpu1 = ffmpegcv.VideoCaptureNV(file, gpu=1) #NVIDIA GPU1
cap_qsv = ffmpegcv.VideoCaptureQSV(file) #Intel QSV, experimental
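When the same script has to run on machines with and without a GPU, one option is to try the GPU reader first and fall back to the CPU reader. This is only a sketch: `open_with_fallback` and `open_video` are hypothetical helpers, and it assumes that `ffmpegcv.VideoCaptureNV` raises an exception when no usable NVIDIA decoder is found, which is not confirmed here.

```python
def open_with_fallback(openers, *args, **kwargs):
    """Try each opener in order and return (name, capture) for the first that works.

    `openers` is a list of (name, callable) pairs. Any exception from an opener
    is treated as "this backend is unavailable" and the next one is tried.
    """
    errors = []
    for name, opener in openers:
        try:
            return name, opener(*args, **kwargs)
        except Exception as exc:  # broad on purpose: backend failures vary
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("no backend could open the source: " + "; ".join(errors))


def open_video(file, **kwargs):
    """Prefer the NVIDIA reader, then fall back to the CPU reader."""
    import ffmpegcv  # imported lazily so the helper above stays dependency-free

    openers = [
        ("gpu", ffmpegcv.VideoCaptureNV),
        ("cpu", ffmpegcv.VideoCapture),
    ]
    return open_with_fallback(openers, file, **kwargs)
```

The returned name tells the caller which backend was actually used, which is useful for logging.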
Use rgb24 instead of bgr24. The gray version would be more efficient.
import matplotlib.pyplot as plt
cap = ffmpegcv.VideoCapture(file, pix_fmt='rgb24') #rgb24, bgr24, gray
ret, frame = cap.read()
plt.imshow(frame)
ROI Operations
You can crop, resize, and pad the video. For these ROI operations, speed ranks ffmpegcv-GPU > ffmpegcv-CPU >> opencv.
Crop the video, which is much faster than reading the whole canvas. The top-left corner is (0, 0).
cap = ffmpegcv.VideoCapture(file, crop_xywh=(0, 0, 640, 480))
Resize the video to the given size.
cap = ffmpegcv.VideoCapture(file, resize=(640, 480))
Resize and keep the aspect ratio with black border padding.
cap = ffmpegcv.VideoCapture(file, resize=(640, 480), resize_keepratio=True)
Crop and then resize the video.
cap = ffmpegcv.VideoCapture(file, crop_xywh=(0, 0, 640, 480), resize=(512, 512))
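To make the keep-ratio resize above concrete, the sketch below computes the scaled image size and the black border widths for a given source and target size. It assumes the image is centered inside the target canvas; the source does not state how ffmpegcv places the padding, so treat the centering and the `letterbox_geometry` name as assumptions.

```python
def letterbox_geometry(src_wh, dst_wh):
    """Return ((new_w, new_h), (pad_x, pad_y)) for an aspect-preserving resize.

    The source is scaled to fit entirely inside the target, then the leftover
    space is split evenly (pad_x on the left, pad_y on the top).
    """
    src_w, src_h = src_wh
    dst_w, dst_h = dst_wh
    # The limiting dimension decides the scale factor.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return (new_w, new_h), (pad_x, pad_y)
```

For a 1920x1080 source and `resize=(640, 480)`, this gives a 640x360 image with 60-pixel borders above and below.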
Extend Options
INFILE_OPTIONS: Add extra options to ffmpeg input.
cap = ffmpegcv.VideoCapture(file, infile_options='-re -stream_loop -1')
# equivalent ffmpeg command
ffmpeg INFILE_OPTIONS -i FILE -f rawvideo pipe:
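The mapping from `infile_options` to the command line can be sketched as follows. It mirrors only the simplified equivalent command shown above; the command ffmpegcv really builds is likely to carry additional options, and `build_read_command` is a hypothetical helper, not a library function.

```python
import shlex


def build_read_command(file, infile_options=""):
    """Build the simplified ffmpeg read command as an argument list.

    Input options must appear before `-i FILE`, because ffmpeg applies
    each option to the next input or output file that follows it.
    """
    return ["ffmpeg", *shlex.split(infile_options), "-i", file, "-f", "rawvideo", "pipe:"]
```

Here `build_read_command("a.mp4", "-re -stream_loop -1")` places `-re -stream_loop -1` ahead of `-i a.mp4`, which is what makes them input options.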
toCUDA device
ffmpegcv can convert a video/stream from HWC-uint8 on the CPU to CHW-float32 on a CUDA device. This significantly reduces your CPU load and is >2x faster than a manual conversion.
Prepare your environment: a CUDA environment and the pycuda package are required; the pytorch package is optional.
nvcc --version # check you've installed NVIDIA CUDA Compiler. Already installed if you've installed Tensorflow-gpu or Pytorch-gpu
pip install ffmpegcv[cuda] #auto install pycuda
# Read a video file to CUDA device, original
cap = ffmpegcv.VideoCaptureNV(file, pix_fmt='rgb24')
ret, frame_HWC_CPU = cap.read()
frame_CHW_CUDA = torch.from_numpy(frame_HWC_CPU).permute(2, 0, 1).cuda().contiguous().float() # 120fps, 1200% CPU load
# speed up
cap = ffmpegcv.toCUDA(ffmpegcv.VideoCapture(file, pix_fmt='yuv420p')) #pix_fmt: 'yuv420p' or 'nv12' only
cap = ffmpegcv.toCUDA(ffmpegcv.VideoCaptureNV(file, pix_fmt='nv12')) #'nv12' is better for gpu
cap = ffmpegcv.toCUDA(vid, tensor_format='chw') #tensor format:'chw'(default) or 'hwc', fp32 precision
cap = ffmpegcv.toCUDA(vid, gpu=1) #choose gpu
# read to the cuda device
ret, frame_CHW_pycuda = cap.read() #380fps, 200% CPU load, dtype is [pycuda array]
ret, frame_CHW_pycudamem = cap.read_cudamem() #dtype is [pycuda mem_alloc]
ret, frame_CHW_CUDA = cap.read_torch() #dtype is [pytorch tensor]
ret, _ = cap.read_torch(frame_CHW_CUDA) #no copy, but need to specify the output memory
frame_CHW_pycuda[:] = (frame_CHW_pycuda - mean) / std #normalize
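For reference, the layout change and normalization that toCUDA performs on the GPU can be written on the CPU with NumPy. This is a sketch of the data transformation only, not of ffmpegcv's implementation, and the function names are illustrative. Note that `mean` and `std` need shape `(C, 1, 1)` to broadcast over a CHW frame.

```python
import numpy as np


def hwc_uint8_to_chw_float32(frame_hwc):
    """Convert an (H, W, C) uint8 image to a contiguous (C, H, W) float32 array."""
    if frame_hwc.ndim != 3:
        raise ValueError("expected an array shaped (H, W, C)")
    return np.ascontiguousarray(frame_hwc.transpose(2, 0, 1), dtype=np.float32)


def normalize_chw(frame_chw, mean, std):
    """Apply per-channel (x - mean) / std to a (C, H, W) array."""
    # Reshape to (C, 1, 1) so each value broadcasts over its whole channel plane.
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return (frame_chw - mean) / std
```

Comparing the output of this CPU version with a frame read through toCUDA is one way to sanity-check a pipeline.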
How does toCUDA make your deep learning pipeline faster than opencv or ffmpeg?
- opencv/ffmpeg use the CPU to convert the video pix_fmt from the original YUV to RGB24, which is slow. ffmpegcv uses CUDA to accelerate the pix_fmt conversion.
- Using yuv420p or nv12 saves CPU load and reduces the memory copied from CPU to GPU.
- ffmpeg stores the image in HWC format. ffmpegcv can use HWC & CHW formats to accelerate video reading.
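The memory saving from yuv420p or nv12 comes down to simple arithmetic on raw 8-bit frame sizes, sketched below. `frame_bytes` is a hypothetical helper; the byte counts follow from the pixel format definitions (3 bytes per pixel for rgb24/bgr24, a full-resolution luma plane plus 2x2-subsampled chroma for yuv420p/nv12).

```python
def frame_bytes(width, height, pix_fmt):
    """Return the size in bytes of one raw 8-bit frame in the given pixel format."""
    luma = width * height
    if pix_fmt in ("rgb24", "bgr24"):
        return luma * 3
    if pix_fmt == "gray":
        return luma
    if pix_fmt in ("yuv420p", "nv12"):
        # Chroma is subsampled 2x in both directions: two quarter-size planes
        # (yuv420p) or one interleaved half-size plane (nv12).
        chroma = ((width + 1) // 2) * ((height + 1) // 2) * 2
        return luma + chroma
    raise ValueError(f"unsupported pix_fmt: {pix_fmt!r}")
```

For a 1920x1080 frame, rgb24 takes 6,220,800 bytes while nv12 takes 3,110,400 bytes, so half as much data crosses from CPU to GPU per frame.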
Video Writer
# cv2
out = cv2.VideoWriter('outpy.avi',
                      cv2.VideoWriter_fourcc('M','J','P','G'),
                      10,
                      (w, h))
out.write(frame1)
out.write(frame2)
out.release()
# ffmpegcv, default codec is 'h264' in cpu 'h265' in gpu.
# frameSize is decided by the size of the first frame.
#
