Person From Vid
AI-powered video frame extraction and pose categorization tool designed for creating high-quality training datasets. Analyzes video files to identify and extract consistent, well-posed frames containing people - perfect for LoRA training, face datasets, and machine learning applications.
Features
🎯 Dataset Creation Focused
- 📐 Consistent Crop Formats: Generate uniform square (1:1), portrait (4:3), or widescreen (16:9) crops perfect for ML training
- ✨ AI-Powered Face Restoration: GFPGAN enhancement automatically improves face quality in your dataset
- 🔄 Resumable Batch Processing: Process hundreds of videos reliably - interruptions automatically resume where they left off
- 📊 Quality-Filtered Output: Only saves high-quality, well-posed frames using advanced blur, brightness, and contrast metrics
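The quality gate above relies on blur, brightness, and contrast scoring. As a rough illustration of how such metrics can be computed (this is a sketch in plain NumPy, not Person From Vid's actual implementation, and the thresholds are made up):

```python
import numpy as np

def quality_metrics(gray: np.ndarray) -> dict:
    """Score a grayscale frame (2-D float array, values in [0, 255])."""
    # Blur: variance of the Laplacian response; low variance suggests a blurry frame.
    lap_kernel = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    h, w = gray.shape
    lap = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            lap += lap_kernel[i, j] * gray[i:i + h - 2, j:j + w - 2]
    return {
        "blur": float(lap.var()),          # higher = sharper
        "brightness": float(gray.mean()),  # mid-range values are usually best
        "contrast": float(gray.std()),     # higher = more tonal range
    }

def passes(metrics, blur_min=100.0, bright_range=(40, 220), contrast_min=20.0):
    """A frame passes only if every metric clears its (illustrative) threshold."""
    return (metrics["blur"] >= blur_min
            and bright_range[0] <= metrics["brightness"] <= bright_range[1]
            and metrics["contrast"] >= contrast_min)
```

A flat, featureless frame scores zero on blur and contrast and is rejected; a sharp, well-lit frame clears all three gates.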
🤖 Advanced AI Analysis
- 🎥 Multi-Format Video Support: Works with MP4, AVI, MOV, MKV, WebM, and more
- 🧠 Intelligent Detection: Uses state-of-the-art models for face detection (yolov8s-face), pose estimation (yolov8s-pose), and head pose analysis (sixdrepnet)
- 📐 Pose & Shot Classification: Automatically categorizes poses (standing, sitting, squatting) and shot types (closeup, medium shot, full body)
- 👤 Head Orientation Analysis: Classifies head directions into 9 cardinal orientations (front, profile, looking up/down, etc.)
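The 9 cardinal head orientations can be understood as coarse buckets over estimated yaw and pitch angles. A minimal sketch of that idea; the thresholds, sign conventions, and label names here are assumptions for illustration, not the tool's actual scheme:

```python
def classify_head_orientation(yaw: float, pitch: float,
                              yaw_thresh: float = 22.5,
                              pitch_thresh: float = 22.5) -> str:
    """Bucket a (yaw, pitch) head pose, in degrees, into one of 9 coarse labels.

    Convention (illustrative): positive yaw = turned to the subject's left,
    positive pitch = looking up.
    """
    horiz = "front"
    if yaw <= -yaw_thresh:
        horiz = "profile_right"
    elif yaw >= yaw_thresh:
        horiz = "profile_left"

    vert = ""
    if pitch >= pitch_thresh:
        vert = "looking_up"
    elif pitch <= -pitch_thresh:
        vert = "looking_down"

    # 3 horizontal x 3 vertical buckets, collapsed to 9 combined labels.
    if horiz == "front":
        return vert or "front"
    return f"{horiz}_{vert}" if vert else horiz
```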
⚡ Performance & Workflow
- 🚀 GPU Acceleration: Optional CUDA/MPS support for significantly faster processing of large datasets
- 🧠 Smart Frame Selection: Keyframe detection, temporal sampling, and deduplication ensure diverse, high-quality results
- 📊 Rich Progress Tracking: Modern console interface with real-time progress displays
- ⚙️ Highly Configurable: Extensive configuration options via CLI, YAML files, or environment variables
Installation
Prerequisites
- Python 3.10 or higher
- FFmpeg (for video processing)
Installing FFmpeg
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Windows: Download from FFmpeg official website or use:
choco install ffmpeg # Using Chocolatey
Install Person From Vid
From PyPI
The recommended way to install is via pip:
pip install personfromvid
From Source
Install uv, then clone and sync:
git clone https://github.com/personfromvid/personfromvid.git
cd personfromvid
uv sync
For an editable install with development dependencies (tests, linters, build tools):
uv sync --extra dev
Without uv, you can still use a virtualenv and pip:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"
Quick Start
# Standard square crops
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration
# Specific resolution
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --resize 512 --output-dir ./dataset
# Portrait-oriented crops with enhanced faces
personfromvid video.mp4 --crop-ratio 4:3 --face-restoration --resize 768
# Batch processing multiple videos to the same dataset directory
personfromvid video1.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
personfromvid video2.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
personfromvid video3.mp4 --crop-ratio 1:1 --face-restoration --output-dir ./my_dataset
Resume & Processing Control
# Normal processing automatically resumes from where it left off
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration
# Force restart from beginning (clears previous state)
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --force
# Keep extracted frames and data files between runs (useful for incremental processing)
personfromvid video.mp4 --crop-ratio 1:1 --face-restoration --keep-temp
💡 Pro Tips for Dataset Creation:
- Use --crop-ratio 1:1 for square crops (most compatible with ML models)
- Enable --face-restoration for higher quality faces in your dataset
- Set --resize 512 or --resize 768 for consistent resolutions
- Processing is resumable - interrupted sessions automatically continue where they left off
- Use --force to restart processing from the beginning when needed
Basic Usage
# Simple processing (saves to video's directory)
personfromvid video.mp4
# Specify output directory
personfromvid video.mp4 --output-dir ./extracted_frames
# Use GPU for faster processing (recommended for large datasets)
personfromvid video.mp4 --device gpu
# Verbose output for monitoring progress
personfromvid video.mp4 --verbose
Advanced Crop Options
# Variable aspect ratio crops (preserve natural proportions)
personfromvid video.mp4 --crop-ratio any --crop-padding 0.2
# Widescreen crops with full frames included
personfromvid video.mp4 --crop-ratio 16:9 --full-frames --output-dir ./widescreen
# Custom padding and high-quality output
personfromvid video.mp4 --crop-ratio 1:1 --crop-padding 0.3 --output-jpg-quality 98
# Large dataset processing with limits
personfromvid video.mp4 \
--crop-ratio 1:1 \
--face-restoration \
--max-frames 1000 \
--batch-size 16 \
--device gpu
Crop Ratio Options:
- 1:1 (square): Most common for ML training, ensures consistent dimensions
- 16:9 (widescreen): Good for cinematic shots, wider context
- 4:3 (portrait): Better for full-body poses, traditional aspect ratio
- any: Preserves natural proportions while applying padding
- Omit --crop-ratio: No cropping, outputs full frames only
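To see how a fixed-ratio crop with padding might be derived from a person or face bounding box, here is a hedged geometric sketch; `crop_box` is an illustrative helper, not part of the tool's API, and the exact expansion rules the tool uses may differ:

```python
def crop_box(bbox, ratio, padding, frame_w, frame_h):
    """Expand bbox=(x1, y1, x2, y2) by `padding` (fraction of box size per side),
    then grow the smaller dimension until width / height == `ratio`
    (e.g. 1.0 for 1:1), shifting the box to stay inside the frame.
    """
    x1, y1, x2, y2 = bbox
    bw, bh = x2 - x1, y2 - y1
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2

    # Apply padding to both dimensions.
    bw *= 1 + 2 * padding
    bh *= 1 + 2 * padding

    # Grow one side so the box matches the requested aspect ratio.
    if bw / bh < ratio:
        bw = bh * ratio
    else:
        bh = bw / ratio

    # Shift the box back inside the frame if it overflows an edge.
    x1 = max(0, min(cx - bw / 2, frame_w - bw))
    y1 = max(0, min(cy - bh / 2, frame_h - bh))
    return (round(x1), round(y1),
            round(min(x1 + bw, frame_w)), round(min(y1 + bh, frame_h)))
```

With `ratio=1.0` the shorter side is widened so the output is square, which is why 1:1 crops of tall person boxes include extra horizontal context.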
Processing Control
# Force restart processing (clears previous state)
personfromvid video.mp4 --force
# Keep temporary files for debugging
personfromvid video.mp4 --keep-temp
# Disable face restoration for faster processing
personfromvid video.mp4 --no-face-restoration
# Custom face restoration strength (0.0-1.0)
personfromvid video.mp4 --face-restoration --face-restoration-strength 0.9
Batch Processing Workflows
Creating Large Datasets
For processing multiple videos with consistent settings, create a simple script or use shell commands:
# Process all MP4 files in a directory to create a unified dataset
for video in *.mp4; do
personfromvid "$video" \
--crop-ratio 1:1 \
--face-restoration \
--resize 512 \
--output-format jpg \
--output-dir ./training_dataset
done
# Or using find for recursive processing
find ./videos -name "*.mp4" -exec personfromvid {} \
--crop-ratio 1:1 \
--face-restoration \
--output-dir ./dataset \
--device gpu \;
Configuration File Approach
For consistent settings across multiple runs, use a configuration file:
# dataset_config.yaml
output:
  image:
    format: "jpg"
    jpg:
      quality: 95
    crop_ratio: "1:1"
    face_restoration_enabled: true
    resize: 512
models:
  device: "gpu"
  batch_size: 8
Then process videos:
personfromvid video1.mp4 --config dataset_config.yaml --output-dir ./dataset
personfromvid video2.mp4 --config dataset_config.yaml --output-dir ./dataset
personfromvid video3.mp4 --config dataset_config.yaml --output-dir ./dataset
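The same loop can be driven from Python via `subprocess`, which makes it easier to add per-video logging or error handling. This sketch assumes `personfromvid` is on your PATH and uses only flags shown above; `build_cmd` and `process_all` are hypothetical helpers, not part of the tool:

```python
import subprocess
from pathlib import Path

def build_cmd(video: Path, config: Path, out_dir: Path) -> list[str]:
    """Assemble one CLI invocation, mirroring the shell examples above."""
    return [
        "personfromvid", str(video),
        "--config", str(config),
        "--output-dir", str(out_dir),
    ]

def process_all(video_dir: str, config: str, out_dir: str) -> None:
    """Run personfromvid over every .mp4 in video_dir with shared settings."""
    for video in sorted(Path(video_dir).glob("*.mp4")):
        # check=True aborts the batch on the first failing video;
        # drop it to let the loop continue past errors instead.
        subprocess.run(build_cmd(video, Path(config), Path(out_dir)), check=True)
```

Because processing is resumable, re-running the driver after an interruption simply continues where each video left off.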
Quality Control and Dataset Curation
# High-quality settings for final dataset
personfromvid video.mp4 \
--crop-ratio 1:1 \
--face-restoration \
--resize 768 \
--output-jpg-quality 98 \
--quality-threshold 0.4 \
--confidence 0.5 \
--max-frames-per-category 8
# Quick preview with lower quality for initial review
personfromvid video.mp4 \
--crop-ratio 1:1 \
--resize 256 \
--max-frames 50 \
--output-dir ./preview
Command-line Options
personfromvid offers extensive options for customizing its behavior:
General Options
| Option | Alias | Description | Default |
| --- | --- | --- | --- |
| --config | -c | Path to a YAML or JSON configuration file. | None |
| --output-dir | -o | Directory to save output files. | Video's directory |
| --log-level | -l | Set logging level (DEBUG, INFO, WARNING, ERROR). | INFO |
| --verbose | -v | Enable verbose output (sets log level to DEBUG). | False |
| --quiet | -q | Suppress non-essential output. | False |
| --no-structured-output | | Disable structured output format (use basic logging). | False |
| --version | | Show version information and exit. | False |
AI Model Options
| Option | Description | Default |
| --- | --- | --- |
| --device | Device to use for AI models (auto, cpu, gpu). | auto |
| --batch-size | Batch size for AI model inference (1-64). | 1 |
| --confidence | Confidence threshold for detections (0.0-1.0). | 0.3 |
Frame Processing Options
| Option | Description | Default |
| --- | --- | --- |
| --max-frames | Maximum frames to extract per video. | None |
| --quality-threshold | Quality threshold for frame selection (0.0-1.0). | 0.2 |
Output Options
| Option | Description | Default |
| --- | --- | --- |
| --output-format | Output image format (jpg or png). | jpg |
| --output-jpg-quality | Quality for JPG output (70-100). | 95 |
| --output-face-crop-enabled / --no-output-face-crop-enabled | Enable or disable generation of cropped face images. | True |
| --output-face-crop-padding | Padding around face bounding box (0.0-1.0). | 0.3 |