Person From Vid
AI-powered video frame extraction and pose categorization tool designed for creating high-quality training datasets. Analyzes video files to identify and extract consistent, well-posed frames containing people - perfect for LoRA training, face datasets, and machine learning applications.
Features
🎯 Dataset Creation Focused
- 📐 Consistent Crop Formats: Generate uniform square (1:1), portrait (4:3), or widescreen (16:9) crops perfect for ML training
- ✨ AI-Powered Face Restoration: GFPGAN enhancement automatically improves face quality in your dataset
- 🔄 Resumable Batch Processing: Process hundreds of videos reliably - interruptions automatically resume where they left off
- 📊 Quality-Filtered Output: Only saves high-quality, well-posed frames using advanced blur, brightness, and contrast metrics
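The quality gate above relies on blur, brightness, and contrast scoring. As a rough illustration of how such metrics can be computed (this is a sketch in plain NumPy, not Person From Vid's actual implementation, and the thresholds are made up):

```python
import numpy as np

def quality_metrics(gray: np.ndarray) -> dict:
    """Score a grayscale frame (2-D float array, values in [0, 255])."""
    # Blur: variance of the Laplacian response; low variance suggests a blurry frame.
    lap_kernel = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    h, w = gray.shape
    lap = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            lap += lap_kernel[i, j] * gray[i:i + h - 2, j:j + w - 2]
    return {
        "blur": float(lap.var()),          # higher = sharper
        "brightness": float(gray.mean()),  # mid-range values are usually best
        "contrast": float(gray.std()),     # higher = more tonal range
    }

def passes(metrics, blur_min=100.0, bright_range=(40, 220), contrast_min=20.0):
    """A frame passes only if every metric clears its (illustrative) threshold."""
    return (metrics["blur"] >= blur_min
            and bright_range[0] <= metrics["brightness"] <= bright_range[1]
            and metrics["contrast"] >= contrast_min)
```

A flat, featureless frame scores zero on blur and contrast and is rejected; a sharp, well-lit frame clears all three gates.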
🤖 Advanced AI Analysis
- 🎥 Multi-Format Video Support: Works with MP4, AVI, MOV, MKV, WebM, and more
- 🧠 Intelligent Detection: Uses state-of-the-art models for face detection (yolov8s-face), pose estimation (yolov8s-pose), and head pose analysis (sixdrepnet)
- 📐 Pose & Shot Classification: Automatically categorizes poses (standing, sitting, squatting) and shot types (closeup, medium shot, full body)
- 👤 Head Orientation Analysis: Classifies head directions into 9 cardinal orientations (front, profile, looking up/down, etc.)
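The 9 cardinal head orientations can be understood as coarse buckets over estimated yaw and pitch angles. A minimal sketch of that idea; the thresholds, sign conventions, and label names here are assumptions for illustration, not the tool's actual scheme:

```python
def classify_head_orientation(yaw: float, pitch: float,
                              yaw_thresh: float = 22.5,
                              pitch_thresh: float = 22.5) -> str:
    """Bucket a (yaw, pitch) head pose, in degrees, into one of 9 coarse labels.

    Convention (illustrative): positive yaw = turned to the subject's left,
    positive pitch = looking up.
    """
    horiz = "front"
    if yaw <= -yaw_thresh:
        horiz = "profile_right"
    elif yaw >= yaw_thresh:
        horiz = "profile_left"

    vert = ""
    if pitch >= pitch_thresh:
        vert = "looking_up"
    elif pitch <= -pitch_thresh:
        vert = "looking_down"

    # 3 horizontal x 3 vertical buckets, collapsed to 9 combined labels.
    if horiz == "front":
        return vert or "front"
    return f"{horiz}_{vert}" if vert else horiz
```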
⚡ Performance & Workflow
- 🚀 GPU Acceleration: Optional CUDA/MPS support for significantly faster processing of large datasets
- 🧠 Smart Frame Selection: Keyframe detection, temporal sampling, and deduplication ensure diverse, high-quality results
- 📊 Rich Progress Tracking: Modern console interface with real-time progress displays
- ⚙️ Highly Configurable: Extensive configuration options via CLI, YAML files, or environment variables
Installation
Prerequisites
- Python 3.10 or higher
- FFmpeg (for video processing)
Installing FFmpeg
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Windows: Download from FFmpeg official website or use:
choco install ffmpeg # Using Chocolatey
Install Person From Vid
From PyPI
The recommended way to install is via pip:
pip install personfromvid
From Source
Install uv, then clone and sync:
git clone https://github.com/personfromvid/personfromvid.git
cd personfromvid
uv sync
For an editable install with development dependencies (tests, linters, build tools):
uv sync --extra dev
Without uv, you can still use a virtualenv and pip:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"
Quick Start
# Standard square crops
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration
# Specific resolution
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --resize 512 --output-dir ./dataset
# Portrait-oriented crops with enhanced faces
personfromvid video.mp4 --crop-ratio 4:3 --face-restoration --resize 768
# Batch processing multiple videos to the same dataset directory
personfromvid video1.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
personfromvid video2.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
personfromvid video3.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
Resume & Processing Control
# Normal processing automatically resumes from where it left off
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration
# Force restart from beginning (clears previous state)
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --force
# Keep extracted frames and data files between runs (useful for incremental processing)
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --keep-temp
💡 Pro Tips for Dataset Creation:
- Use --crop-ratio 1:1 for square crops (most compatible with ML models)
- Enable --face-restoration for higher quality faces in your dataset
- Set --resize 512 or --resize 768 for consistent resolutions
- Processing is resumable - interrupted sessions automatically continue where they left off
- Use --force to restart processing from the beginning when needed
Basic Usage
# Simple processing (saves to video's directory)
personfromvid video.mp4
# Specify output directory
personfromvid video.mp4 --output-dir ./extracted_frames
# Use GPU for faster processing (recommended for large datasets)
personfromvid video.mp4 --device gpu
# Verbose output for monitoring progress
personfromvid video.mp4 --verbose
Advanced Crop Options
# Variable aspect ratio crops (preserve natural proportions)
personfromvid video.mp4 --crop-ratio any --crop-padding 0.2
# Widescreen crops with full frames included
personfromvid video.mp4 --crop-ratio 16:9 --full-frames --output-dir ./widescreen
# Custom padding and high-quality output
personfromvid video.mp4 --crop-ratio 1:1 --crop-padding 0.3 --output-jpg-quality 98
# Large dataset processing with limits
personfromvid video.mp4 \
--crop-ratio 1:1 \
--face-restoration \
--max-frames 1000 \
--batch-size 16 \
--device gpu
Crop Ratio Options:
- 1:1 (square): Most common for ML training, ensures consistent dimensions
- 16:9 (widescreen): Good for cinematic shots, wider context
- 4:3 (portrait): Better for full-body poses, traditional aspect ratio
- any: Preserves natural proportions while applying padding
- Omit --crop-ratio: No cropping, outputs full frames only
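To see how a fixed-ratio crop with padding might be derived from a person or face bounding box, here is a hedged geometric sketch; `crop_box` is an illustrative helper, not part of the tool's API, and the exact expansion rules the tool uses may differ:

```python
def crop_box(bbox, ratio, padding, frame_w, frame_h):
    """Expand bbox=(x1, y1, x2, y2) by `padding` (fraction of box size per side),
    then grow the smaller dimension until width / height == `ratio`
    (e.g. 1.0 for 1:1), shifting the box to stay inside the frame.
    """
    x1, y1, x2, y2 = bbox
    bw, bh = x2 - x1, y2 - y1
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2

    # Apply padding to both dimensions.
    bw *= 1 + 2 * padding
    bh *= 1 + 2 * padding

    # Grow one side so the box matches the requested aspect ratio.
    if bw / bh < ratio:
        bw = bh * ratio
    else:
        bh = bw / ratio

    # Shift the box back inside the frame if it overflows an edge.
    x1 = max(0, min(cx - bw / 2, frame_w - bw))
    y1 = max(0, min(cy - bh / 2, frame_h - bh))
    return (round(x1), round(y1),
            round(min(x1 + bw, frame_w)), round(min(y1 + bh, frame_h)))
```

With `ratio=1.0` the shorter side is widened so the output is square, which is why 1:1 crops of tall person boxes include extra horizontal context.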
Processing Control
# Force restart processing (clears previous state)
personfromvid video.mp4 --force
# Keep temporary files for debugging
personfromvid video.mp4 --keep-temp
# Disable face restoration for faster processing
personfromvid video.mp4 --no-face-restoration
# Custom face restoration strength (0.0-1.0)
personfromvid video.mp4 --face-restoration --face-restoration-strength 0.9
Batch Processing Workflows
Creating Large Datasets
For processing multiple videos with consistent settings, create a simple script or use shell commands:
# Process all MP4 files in a directory to create a unified dataset
for video in *.mp4; do
personfromvid "$video" \
--crop-ratio 1:1 \
--face-restoration \
--resize 512 \
--output-format jpg \
--output-dir ./training_dataset
done
# Or using find for recursive processing
find ./videos -name "*.mp4" -exec personfromvid {} \
--crop-ratio 1:1 \
--face-restoration \
--output-dir ./dataset \
--device gpu \;
Configuration File Approach
For consistent settings across multiple runs, use a configuration file:
# dataset_config.yaml
output:
  image:
    format: "jpg"
    jpg:
      quality: 95
    crop_ratio: "1:1"
    face_restoration_enabled: true
    resize: 512
models:
  device: "gpu"
  batch_size: 8
Then process videos:
personfromvid video1.mp4 --config dataset_config.yaml --output-dir ./dataset
personfromvid video2.mp4 --config dataset_config.yaml --output-dir ./dataset
personfromvid video3.mp4 --config dataset_config.yaml --output-dir ./dataset
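The same loop can be driven from Python via `subprocess`, which makes it easier to add per-video logging or error handling. This sketch assumes `personfromvid` is on your PATH and uses only flags shown above; `build_cmd` and `process_all` are hypothetical helpers, not part of the tool:

```python
import subprocess
from pathlib import Path

def build_cmd(video: Path, config: Path, out_dir: Path) -> list[str]:
    """Assemble one CLI invocation, mirroring the shell examples above."""
    return [
        "personfromvid", str(video),
        "--config", str(config),
        "--output-dir", str(out_dir),
    ]

def process_all(video_dir: str, config: str, out_dir: str) -> None:
    """Run personfromvid over every .mp4 in video_dir with shared settings."""
    for video in sorted(Path(video_dir).glob("*.mp4")):
        # check=True aborts the batch on the first failing video;
        # drop it to let the loop continue past errors instead.
        subprocess.run(build_cmd(video, Path(config), Path(out_dir)), check=True)
```

Because processing is resumable, re-running the driver after an interruption simply continues where each video left off.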
Quality Control and Dataset Curation
# High-quality settings for final dataset
personfromvid video.mp4 \
--crop-ratio 1:1 \
--face-restoration \
--resize 768 \
--output-jpg-quality 98 \
--quality-threshold 0.4 \
--confidence 0.5 \
--max-frames-per-category 8
# Quick preview with lower quality for initial review
personfromvid video.mp4 \
--crop-ratio 1:1 \
--resize 256 \
--max-frames 50 \
--output-dir ./preview
Command-line Options
personfromvid offers extensive options for customizing its behavior:
General Options
| Option | Alias | Description | Default |
| --- | --- | --- | --- |
| --config | -c | Path to a YAML or JSON configuration file. | None |
| --output-dir | -o | Directory to save output files. | Video's directory |
| --log-level | -l | Set logging level (DEBUG, INFO, WARNING, ERROR). | INFO |
| --verbose | -v | Enable verbose output (sets log level to DEBUG). | False |
| --quiet | -q | Suppress non-essential output. | False |
| --no-structured-output | | Disable structured output format (use basic logging). | False |
| --version | | Show version information and exit. | False |
AI Model Options
| Option | Description | Default |
| --- | --- | --- |
| --device | Device to use for AI models (auto, cpu, gpu). | auto |
| --batch-size | Batch size for AI model inference (1-64). | 1 |
| --confidence | Confidence threshold for detections (0.0-1.0). | 0.3 |
Frame Processing Options
| Option | Description | Default |
| --- | --- | --- |
| --max-frames | Maximum frames to extract per video. | None |
| --quality-threshold | Quality threshold for frame selection (0.0-1.0). | 0.2 |
Output Options
| Option | Description | Default |
| --- | --- | --- |
| --output-format | Output image format (jpg or png). | jpg |
| --output-jpg-quality | Quality for JPG output (70-100). | 95 |
| --output-face-crop-enabled / --no-output-face-crop-enabled | Enable or disable generation of cropped face images. | True |
| --output-face-crop-padding | Padding around face bounding box (0.0-1.0). | 0.3 |