
Person From Vid

License: GPL-3.0-or-later

AI-powered video frame extraction and pose categorization tool designed for creating high-quality training datasets. It analyzes video files to identify and extract consistent, well-posed frames containing people, which suits LoRA training, face datasets, and other machine learning applications.

Features

🎯 Dataset Creation Focused

  • 📐 Consistent Crop Formats: Generate uniform square (1:1), portrait (4:3), or widescreen (16:9) crops perfect for ML training
  • AI-Powered Face Restoration: GFPGAN enhancement automatically improves face quality in your dataset
  • 🔄 Resumable Batch Processing: Process hundreds of videos reliably; an interrupted run automatically resumes where it left off
  • 📊 Quality-Filtered Output: Only saves high-quality, well-posed frames using advanced blur, brightness, and contrast metrics
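
The README does not say how the blur, brightness, and contrast metrics are computed. As a rough illustration only, the sketch below uses three common stand-ins: variance of a Laplacian response for sharpness, mean intensity for brightness, and standard deviation for contrast. The function names and the list-of-rows grayscale input are assumptions made for this example, not Person From Vid's implementation.

```python
from statistics import mean, pvariance


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian; low values suggest a blurry frame."""
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def frame_metrics(gray):
    """Hypothetical helper: gray is a list of rows of 0-255 intensities."""
    pixels = [p for row in gray for p in row]
    return {
        "blur": laplacian_variance(gray),      # higher = sharper
        "brightness": mean(pixels),            # average intensity
        "contrast": pvariance(pixels) ** 0.5,  # standard deviation
    }
```

A perfectly flat image scores zero on both blur and contrast, while a high-frequency pattern such as a checkerboard scores high on both.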

🤖 Advanced AI Analysis

  • 🎥 Multi-Format Video Support: Works with MP4, AVI, MOV, MKV, WebM, and more
  • 🧠 Intelligent Detection: Uses state-of-the-art models for face detection (yolov8s-face), pose estimation (yolov8s-pose), and head pose analysis (sixdrepnet)
  • 📐 Pose & Shot Classification: Automatically categorizes poses (standing, sitting, squatting) and shot types (closeup, medium shot, full body)
  • 👤 Head Orientation Analysis: Classifies head directions into 9 cardinal orientations (front, profile, looking up/down, etc.)
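
One plausible way to arrive at nine orientations is a 3x3 grid over yaw (left, centre, right) and pitch (up, level, down) angles, which head pose models typically report in degrees. The sketch below illustrates that idea only. The 22.5 degree limit, the sign convention, and the label names are all assumptions for this example; the tool's actual thresholds and labels are not given here.

```python
def classify_head_direction(yaw, pitch, limit=22.5):
    """Map yaw/pitch angles in degrees to one of nine hypothetical labels."""
    # Assumed sign convention: positive yaw = turned right, positive pitch = up.
    horizontal = "left" if yaw < -limit else "right" if yaw > limit else ""
    vertical = "up" if pitch > limit else "down" if pitch < -limit else ""
    if not horizontal and not vertical:
        return "front"
    if not vertical:
        return f"profile_{horizontal}"
    if not horizontal:
        return f"looking_{vertical}"
    return f"looking_{vertical}_{horizontal}"
```

With these assumed limits, a yaw of 40 degrees and a level pitch maps to `profile_right`, and small angles on both axes map to `front`.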

⚡ Performance & Workflow

  • 🚀 GPU Acceleration: Optional CUDA/MPS support for significantly faster processing of large datasets
  • 🧠 Smart Frame Selection: Keyframe detection, temporal sampling, and deduplication ensure diverse, high-quality results
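
To make the deduplication idea concrete, here is a minimal sketch based on an average hash compared by Hamming distance. This is a common near-duplicate technique chosen for illustration, not necessarily the one Person From Vid uses. It assumes each frame has already been downscaled to the same small grayscale grid, and `max_distance` is a hypothetical tuning parameter.

```python
def average_hash(gray):
    """One bit per pixel: True where the pixel is brighter than the frame mean."""
    pixels = [p for row in gray for p in row]
    threshold = sum(pixels) / len(pixels)
    return tuple(p > threshold for p in pixels)


def hamming(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def deduplicate(frames, max_distance=2):
    """Return indices of frames that differ from every frame kept so far."""
    kept, hashes = [], []
    for index, gray in enumerate(frames):
        h = average_hash(gray)
        if all(hamming(h, other) > max_distance for other in hashes):
            kept.append(index)
            hashes.append(h)
    return kept
```

Frames that differ only by small pixel noise produce the same hash and collapse to a single entry, while a frame with a different layout is kept.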
  • 📊 Rich Progress Tracking: Modern console interface with real-time progress displays
  • ⚙️ Highly Configurable: Extensive configuration options via CLI, YAML files, or environment variables

Installation

Prerequisites

  • Python 3.10 or higher
  • FFmpeg (for video processing)
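
Before installing, it can help to confirm both prerequisites from a script. The following is a small sketch using only the Python standard library; `check_prerequisites` is a name invented for this example and is not part of Person From Vid.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version listed in the prerequisites


def check_prerequisites(min_python=MIN_PYTHON):
    """Return a dict describing which prerequisites are satisfied."""
    return {
        "python": sys.version_info[:2] >= min_python,
        # shutil.which returns None when the executable is not on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```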

Installing FFmpeg

macOS:

brew install ffmpeg

Ubuntu/Debian:

sudo apt update
sudo apt install ffmpeg

Windows: Download from the official FFmpeg website, or install with Chocolatey:

choco install ffmpeg  # Using Chocolatey

Install Person From Vid

From PyPI

The recommended way to install is via pip:

pip install personfromvid

From Source

Install uv, then clone and sync:

git clone https://github.com/personfromvid/personfromvid.git
cd personfromvid
uv sync

For an editable install with development dependencies (tests, linters, build tools):

uv sync --extra dev

Without uv, you can still use a virtualenv and pip:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"

Quick Start

# Standard square crops
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration

# Specific resolution
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --resize 512 --output-dir ./dataset

# Portrait-oriented crops with enhanced faces
personfromvid video.mp4 --crop-ratio 4:3 --face-restoration --resize 768

# Batch processing multiple videos to the same dataset directory
personfromvid video1.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
personfromvid video2.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
personfromvid video3.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset

Resume & Processing Control

# Normal processing automatically resumes from where it left off
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration

# Force restart from beginning (clears previous state)
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --force

# Keep extracted frames and data files between runs (useful for incremental processing)
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --keep-temp

💡 Pro Tips for Dataset Creation:

  • Use --crop-ratio 1:1 for square crops (most compatible with ML models)
  • Enable --face-restoration for higher quality faces in your dataset
  • Set --resize 512 or --resize 768 for consistent resolutions
  • Processing is resumable - interrupted sessions automatically continue where they left off
  • Use --force to restart processing from the beginning when needed

Basic Usage

# Simple processing (saves to video's directory)
personfromvid video.mp4

# Specify output directory
personfromvid video.mp4 --output-dir ./extracted_frames

# Use GPU for faster processing (recommended for large datasets)
personfromvid video.mp4 --device gpu

# Verbose output for monitoring progress
personfromvid video.mp4 --verbose

Advanced Crop Options

# Variable aspect ratio crops (preserve natural proportions)
personfromvid video.mp4 --crop-ratio any --crop-padding 0.2

# Widescreen crops with full frames included
personfromvid video.mp4 --crop-ratio 16:9 --full-frames --output-dir ./widescreen

# Custom padding and high-quality output
personfromvid video.mp4 --crop-ratio 1:1 --crop-padding 0.3 --output-jpg-quality 98

# Large dataset processing with limits
personfromvid video.mp4 \
    --crop-ratio 1:1 \
    --face-restoration \
    --max-frames 1000 \
    --batch-size 16 \
    --device gpu

Crop Ratio Options:

  • 1:1 (square): Most common for ML training, ensures consistent dimensions
  • 16:9 (widescreen): Good for cinematic shots, wider context
  • 4:3 (portrait): Better for full-body poses, traditional aspect ratio
  • any: Preserves natural proportions while applying padding
  • Omit --crop-ratio: No cropping, outputs full frames only
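
The interaction between a fixed ratio and padding can be sketched as follows: pad the detected bounding box, widen or heighten it until it matches the target ratio, then keep it inside the frame. This is an illustrative sketch only. It assumes padding is a fraction of the box size added on each side and that ratios are written width:height; the README does not specify either, and `parse_ratio` and `crop_box` are hypothetical helpers.

```python
def parse_ratio(text):
    """Turn '1:1' or '16:9' into width/height; 'any' means no fixed ratio."""
    if text == "any":
        return None
    width, height = text.split(":")
    return int(width) / int(height)


def crop_box(bbox, ratio, padding, frame_size):
    """Expand bbox (x1, y1, x2, y2) to the target ratio inside the frame."""
    x1, y1, x2, y2 = bbox
    frame_w, frame_h = frame_size
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w = (x2 - x1) * (1 + 2 * padding)  # assumed: padding applied per side
    h = (y2 - y1) * (1 + 2 * padding)
    if ratio is not None:
        if w / h < ratio:
            w = h * ratio  # too narrow: widen
        else:
            h = w / ratio  # too wide: heighten
    scale = min(1.0, frame_w / w, frame_h / h)  # shrink if larger than frame
    w, h = w * scale, h * scale
    left = min(max(cx - w / 2, 0), frame_w - w)  # shift back inside the frame
    top = min(max(cy - h / 2, 0), frame_h - h)
    return (round(left), round(top), round(left + w), round(top + h))
```

For a 200x400 person box with no padding, a 1:1 ratio widens the crop to 400x400 around the same centre, whereas `any` keeps the padded box's own proportions.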

Processing Control

# Force restart processing (clears previous state)
personfromvid video.mp4 --force

# Keep temporary files for debugging
personfromvid video.mp4 --keep-temp

# Disable face restoration for faster processing
personfromvid video.mp4 --no-face-restoration

# Custom face restoration strength (0.0-1.0)
personfromvid video.mp4 --face-restoration --face-restoration-strength 0.9

Batch Processing Workflows

Creating Large Datasets

For processing multiple videos with consistent settings, create a simple script or use shell commands:

# Process all MP4 files in a directory to create a unified dataset
for video in *.mp4; do
    personfromvid "$video" \
        --crop-ratio 1:1 \
        --face-restoration \
        --resize 512 \
        --output-format jpg \
        --output-dir ./training_dataset
done

# Or using find for recursive processing
find ./videos -name "*.mp4" -exec personfromvid {} \
    --crop-ratio 1:1 \
    --face-restoration \
    --output-dir ./dataset \
    --device gpu \;
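
The same batch loop can be driven from Python when you want to record which videos failed instead of stopping at the first error. The sketch below only assembles flags that already appear in this README; `build_command` and `process_directory` are names invented for the example, and `dry_run=True` prints the commands without running them.

```python
import subprocess
from pathlib import Path


def build_command(video, output_dir, crop_ratio="1:1", resize=None,
                  face_restoration=True, device=None):
    """Assemble a personfromvid argument list from flags shown in this README."""
    cmd = ["personfromvid", str(video),
           "--crop-ratio", crop_ratio,
           "--output-dir", str(output_dir)]
    if face_restoration:
        cmd.append("--face-restoration")
    if resize is not None:
        cmd += ["--resize", str(resize)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


def process_directory(video_dir, output_dir, dry_run=True, **options):
    """Run (or just print) one command per .mp4 file, recursing into subfolders."""
    exit_codes = {}
    for video in sorted(Path(video_dir).rglob("*.mp4")):
        cmd = build_command(video, output_dir, **options)
        if dry_run:
            print(" ".join(cmd))
            continue
        # check=False: record the failure and continue with the next video
        exit_codes[str(video)] = subprocess.run(cmd, check=False).returncode
    return exit_codes
```

Calling `process_directory("./videos", "./dataset", dry_run=False, resize=512, device="gpu")` returns a mapping from video path to exit code, so failed videos can be retried later.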

Configuration File Approach

For consistent settings across multiple runs, use a configuration file:

# dataset_config.yaml
output:
  image:
    format: "jpg"
    jpg:
      quality: 95
    crop_ratio: "1:1"
    face_restoration_enabled: true
    resize: 512
models:
  device: "gpu"
  batch_size: 8

Then process videos:

personfromvid video1.mp4 --config dataset_config.yaml --output-dir ./dataset
personfromvid video2.mp4 --config dataset_config.yaml --output-dir ./dataset
personfromvid video3.mp4 --config dataset_config.yaml --output-dir ./dataset
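
Since `--config` is documented as accepting YAML or JSON, a config can also be generated from a script. The sketch below renders the same settings as JSON with the standard library. It assumes the JSON form mirrors the YAML keys one-to-one, which this README does not confirm, and `render_config` is a hypothetical helper.

```python
import json


def render_config(crop_ratio="1:1", resize=512, device="gpu", batch_size=8):
    """Return a JSON string mirroring dataset_config.yaml (assumed key layout)."""
    config = {
        "output": {
            "image": {
                "format": "jpg",
                "jpg": {"quality": 95},
                "crop_ratio": crop_ratio,
                "face_restoration_enabled": True,
                "resize": resize,
            }
        },
        "models": {"device": device, "batch_size": batch_size},
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Redirect to a file, e.g.: python make_config.py > dataset_config.json
    print(render_config())
```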

Quality Control and Dataset Curation

# High-quality settings for final dataset
personfromvid video.mp4 \
    --crop-ratio 1:1 \
    --face-restoration \
    --resize 768 \
    --output-jpg-quality 98 \
    --quality-threshold 0.4 \
    --confidence 0.5 \
    --max-frames-per-category 8

# Quick preview with lower quality for initial review
personfromvid video.mp4 \
    --crop-ratio 1:1 \
    --resize 256 \
    --max-frames 50 \
    --output-dir ./preview
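
Conceptually, `--quality-threshold` and `--max-frames-per-category` combine into a filter followed by a per-category top-N selection. The following sketch shows that logic on plain dictionaries. The record shape (`category`, `quality`) and the function name are assumptions for illustration, and the tool's real selection may weigh additional factors.

```python
from collections import defaultdict


def select_frames(candidates, quality_threshold=0.2, max_per_category=8):
    """Keep the best-scoring frames per category above the quality threshold."""
    by_category = defaultdict(list)
    for frame in candidates:
        if frame["quality"] >= quality_threshold:
            by_category[frame["category"]].append(frame)
    return {
        category: sorted(frames, key=lambda f: f["quality"], reverse=True)[:max_per_category]
        for category, frames in by_category.items()
    }
```

Raising the threshold shrinks every category, while lowering the per-category cap trims only the categories that have many good frames.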

Command-line Options

personfromvid exposes the following options, grouped by purpose:

General Options

| Option | Alias | Description | Default |
| --- | --- | --- | --- |
| --config | -c | Path to a YAML or JSON configuration file. | None |
| --output-dir | -o | Directory to save output files. | Video's directory |
| --log-level | -l | Set logging level (DEBUG, INFO, WARNING, ERROR). | INFO |
| --verbose | -v | Enable verbose output (sets log level to DEBUG). | False |
| --quiet | -q | Suppress non-essential output. | False |
| --no-structured-output | | Disable structured output format (use basic logging). | False |
| --version | | Show version information and exit. | False |

AI Model Options

| Option | Description | Default |
| --- | --- | --- |
| --device | Device to use for AI models (auto, cpu, gpu). | auto |
| --batch-size | Batch size for AI model inference (1-64). | 1 |
| --confidence | Confidence threshold for detections (0.0-1.0). | 0.3 |

Frame Processing Options

| Option | Description | Default |
| --- | --- | --- |
| --max-frames | Maximum frames to extract per video. | None |
| --quality-threshold | Quality threshold for frame selection (0.0-1.0). | 0.2 |

Output Options

| Option | Description | Default |
| --- | --- | --- |
| --output-format | Output image format (jpg or png). | jpg |
| --output-jpg-quality | Quality for JPG output (70-100). | 95 |
| --output-face-crop-enabled / --no-output-face-crop-enabled | Enable or disable generation of cropped face images. | True |
| --output-face-crop-padding | Padding around face bounding box (0.0-1.0). | 0.3 |
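
When commands are generated by a script, the documented ranges can be checked before launching a long run. This is a small sketch that encodes only the ranges listed in the tables above; `find_range_errors` is a hypothetical helper, and the CLI presumably performs its own validation as well.

```python
# Ranges copied from the option tables in this README.
RANGES = {
    "--batch-size": (1, 64),
    "--confidence": (0.0, 1.0),
    "--quality-threshold": (0.0, 1.0),
    "--output-jpg-quality": (70, 100),
    "--output-face-crop-padding": (0.0, 1.0),
}


def find_range_errors(options):
    """Return a message for each option whose value is outside its range."""
    errors = []
    for flag, value in options.items():
        if flag in RANGES:  # flags without a documented range are ignored
            low, high = RANGES[flag]
            if not low <= value <= high:
                errors.append(f"{flag}={value} is outside {low}-{high}")
    return errors
```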
