# SwiftPixelUtils
<p align="center"> <img src="https://img.shields.io/badge/Swift-5.9-orange" alt="Swift"> <img src="https://img.shields.io/badge/platforms-iOS%2015%2B%20%7C%20macOS%2012%2B-blue" alt="Platforms"> <img src="https://img.shields.io/badge/SPM-compatible-brightgreen" alt="SPM"> <img src="https://img.shields.io/github/license/manishkumar03/SwiftPixelUtils" alt="License"> </p> <p align="center"> <strong>High-performance Swift library for image preprocessing optimized for ML/AI inference pipelines on iOS/macOS</strong> </p> <p align="center"> <a href="#installation">Installation</a> • <a href="#quick-start">Quick Start</a> • <a href="#api-reference">API Reference</a> • <a href="#features">Features</a> • <a href="docs/README.md">📚 Docs</a> </p>

High-performance Swift library for image preprocessing optimized for ML/AI inference pipelines. Native implementations using Core Image, Accelerate, and Core ML for pixel extraction, tensor conversion, quantization, augmentation, and model-specific preprocessing (YOLO, MobileNet, etc.).
## ✨ Features
### Core Preprocessing
- 🚀 High Performance: Native implementations using Apple frameworks (Core Image, Accelerate, vImage, Core ML)
- 🔢 Raw Pixel Data: Extract pixel values as typed arrays (Float, Float16, Int32, UInt8) ready for ML inference
- 🎨 Multiple Color Formats: RGB, RGBA, BGR, BGRA, Grayscale, HSV, HSL, LAB, YUV, YCbCr
- 📐 Flexible Resizing: Cover, contain, stretch, and letterbox strategies with automatic transform metadata
- ✂️ ROI Pipeline: Crop → resize → normalize in a single call via ROI options
- 🔢 ML-Ready Normalization: ImageNet, TensorFlow, custom presets
- 📊 Multiple Data Layouts: HWC, CHW, NHWC, NCHW (PyTorch/TensorFlow compatible)
- 🖼️ Multiple Sources: local file URLs, data, base64, assets, photo library
- 📱 Orientation Handling: Opt-in UIImage/EXIF orientation normalization to fix silent rotation issues
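The "ML-Ready Normalization" bullet above boils down to simple per-channel arithmetic. A minimal sketch of the ImageNet variant follows; the helper name and interleaved-HWC assumption are illustrative, not the library's actual API:

```swift
// Sketch of ImageNet-style normalization: each channel value in [0, 1]
// is shifted by the dataset mean and divided by the dataset standard
// deviation. Names here are illustrative, not SwiftPixelUtils API.
let mean: [Float] = [0.485, 0.456, 0.406]  // ImageNet RGB means
let std:  [Float] = [0.229, 0.224, 0.225]  // ImageNet RGB std devs

func normalizeImageNet(_ pixels: [Float], channels: Int = 3) -> [Float] {
    // pixels are interleaved HWC values already scaled to [0, 1]
    return pixels.enumerated().map { index, value in
        let c = index % channels
        return (value - mean[c]) / std[c]
    }
}

// A pixel exactly at the dataset mean normalizes to zero in every channel
let normalized = normalizeImageNet([0.485, 0.456, 0.406])
```

Presets for TensorFlow-style models use the same shape of computation with different constants (e.g. mapping to [-1, 1]).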
### ML Framework Integration
- 🤖 Simplified ML APIs: One-line preprocessing (`getModelInput`) and postprocessing (`ClassificationOutput`, `DetectionOutput`, `SegmentationOutput`, `DepthEstimationOutput`) for all major frameworks
- 🤖 Model Presets: Pre-configured settings for YOLO (v8/v9/v10), RT-DETR, MobileNet, EfficientNet, ResNet, ViT, CLIP, SAM/SAM2, DINO, DETR, Mask2Former, UNet, DeepLab, SegFormer, FCN, PSPNet
- 🎯 Framework Targets: Automatic configuration for PyTorch, TensorFlow, TFLite, CoreML, ONNX Runtime, ExecuTorch, OpenCV
- 🏷️ Label Database: Built-in labels for COCO, ImageNet, VOC, CIFAR, Places365, ADE20K, Open Images, LVIS, Objects365, Kinetics
### ONNX Runtime Integration
- 🔌 ONNX Helper: Streamlined tensor creation for ONNX Runtime with `ONNXHelper.createTensorData()`
- 📊 ONNX Data Types: Support for Float32, Float16, UInt8, Int8, Int32, Int64 tensor types
- 🎯 ONNX Model Configs: Pre-configured settings for YOLOv8, RT-DETR, ResNet, MobileNetV2, ViT, CLIP
- 🔍 Output Parsing: Built-in parsers for YOLOv8, YOLOv5, RT-DETR, SSD detection outputs
- 🧮 Segmentation Output: Parse ONNX segmentation model outputs with argmax
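The argmax decoding mentioned in the last bullet can be sketched in a few lines: for a logit tensor in CHW layout, each pixel is assigned the class with the highest score. The function name and layout here are assumptions for illustration, not the library's parser API:

```swift
// Argmax over the class dimension of a [classes, height, width] logit
// tensor, producing a per-pixel class-index mask.
func argmaxMask(logits: [Float], classes: Int, height: Int, width: Int) -> [Int] {
    let plane = height * width
    var mask = [Int](repeating: 0, count: plane)
    for p in 0..<plane {
        var best = 0
        var bestScore = logits[p]  // class 0 at this pixel
        for c in 1..<classes {
            let score = logits[c * plane + p]
            if score > bestScore { bestScore = score; best = c }
        }
        mask[p] = best
    }
    return mask
}

// 2 classes over a 1×2 image: pixel 0 favors class 1, pixel 1 favors class 0
let mask = argmaxMask(logits: [0.1, 0.9, 0.8, 0.2], classes: 2, height: 1, width: 2)
```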
### Quantization
- 🎯 Native Quantization: Float→Int8/UInt8/Int16/INT4 with per-tensor and per-channel support (TFLite/ExecuTorch compatible)
- 🔢 INT4 Quantization: 4-bit quantization (8× compression) for LLM weights and edge deployment
- 📊 Per-Channel Quantization: Channel-wise scale/zeroPoint for higher accuracy (CNN, Transformer weights)
- 🔄 Float16 Conversion: IEEE 754 half-precision ↔ Float32 utilities for CVPixelBuffer processing
- 🎥 CVPixelBuffer Formats: BGRA/RGBA, NV12, and RGB565 conversion to tensor data
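The affine (asymmetric) quantization scheme referenced above follows the TFLite convention `q = round(x / scale) + zeroPoint`. A minimal sketch of the per-tensor UInt8 case, with illustrative helper names rather than the library's API:

```swift
// Derive scale/zeroPoint from a float range, then quantize. The range is
// widened to include 0 so that zero is exactly representable (TFLite rule).
func quantizationParams(min: Float, max: Float) -> (scale: Float, zeroPoint: Int) {
    let lo = Swift.min(min, 0), hi = Swift.max(max, 0)
    let scale = (hi - lo) / 255.0
    let zeroPoint = Int((-lo / scale).rounded())
    return (scale, zeroPoint)
}

func quantize(_ values: [Float], scale: Float, zeroPoint: Int) -> [UInt8] {
    values.map { x in
        let q = Int((x / scale).rounded()) + zeroPoint
        return UInt8(Swift.min(255, Swift.max(0, q)))  // clamp to UInt8 range
    }
}

let (scale, zp) = quantizationParams(min: 0.0, max: 1.0)
let q = quantize([0.0, 0.25, 1.0], scale: scale, zeroPoint: zp)
// 0.0 maps to the zero point (0 here); 1.0 maps to 255
```

Per-channel quantization repeats this with a separate `(scale, zeroPoint)` pair per output channel.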
### Detection & Segmentation
- 📦 Bounding Box Utilities: Format conversion (xyxy/xywh/cxcywh), scaling, clipping, IoU, NMS
- 🖼️ Letterbox Padding: YOLO-style letterbox preprocessing with automatic transform metadata for reverse coordinate mapping
- 📏 Depth Estimation: Process MiDaS, DPT, ZoeDepth, Depth Anything outputs with scientific colormaps (Viridis, Plasma, Turbo) and custom colormaps
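The IoU computation underlying the bounding box utilities above (and NMS, which thresholds on it) can be sketched for xyxy boxes; the `Box` type and function are illustrative, not the library's types:

```swift
// Intersection-over-Union of two axis-aligned boxes in xyxy format.
struct Box { let x1, y1, x2, y2: Float }

func iou(_ a: Box, _ b: Box) -> Float {
    // Overlap extents, clamped at zero when the boxes are disjoint
    let ix = max(0, min(a.x2, b.x2) - max(a.x1, b.x1))
    let iy = max(0, min(a.y2, b.y2) - max(a.y1, b.y1))
    let inter = ix * iy
    let areaA = (a.x2 - a.x1) * (a.y2 - a.y1)
    let areaB = (b.x2 - b.x1) * (b.y2 - b.y1)
    let union = areaA + areaB - inter
    return union > 0 ? inter / union : 0
}

// Two 10×10 boxes overlapping in a 5×10 strip: IoU = 50 / 150 = 1/3
let overlap = iou(Box(x1: 0, y1: 0, x2: 10, y2: 10),
                  Box(x1: 5, y1: 0, x2: 15, y2: 10))
```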
### Data Augmentation
- 🔄 Image Augmentation: Rotation, flip, brightness, contrast, saturation, blur
- 🎨 Color Jitter: Granular brightness/contrast/saturation/hue control with range support and seeded randomness
- ✂️ Cutout/Random Erasing: Mask random regions with constant/noise fill for robustness training
- 🎲 Random Crop with Seed: Reproducible random crops for data augmentation pipelines
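Seeded, reproducible randomness (as in the random-crop bullet above) presumably relies on a deterministic generator; Swift's standard library lets you plug one in via `RandomNumberGenerator`. A sketch with a tiny SplitMix64 generator, names illustrative only:

```swift
// A deterministic RNG (SplitMix64) driving a reproducible random crop.
struct SplitMix64: RandomNumberGenerator {
    var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

func randomCropOrigin(imageW: Int, imageH: Int, cropW: Int, cropH: Int,
                      seed: UInt64) -> (x: Int, y: Int) {
    var rng = SplitMix64(seed: seed)
    let x = Int.random(in: 0...(imageW - cropW), using: &rng)
    let y = Int.random(in: 0...(imageH - cropH), using: &rng)
    return (x, y)
}

// The same seed always yields the same crop origin
let a = randomCropOrigin(imageW: 640, imageH: 480, cropW: 224, cropH: 224, seed: 42)
let b = randomCropOrigin(imageW: 640, imageH: 480, cropW: 224, cropH: 224, seed: 42)
```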
### Tensor Operations
- 🧮 Tensor Operations: Channel extraction, patch extraction, permutation, batch concatenation
- 🔙 Tensor to Image: Convert processed tensors back to images
- 🔲 Grid/Patch Extraction: Extract image patches in grid patterns for sliding window inference
- ✅ Tensor Validation: Validate tensor shapes, dtypes, and value ranges before inference
- 📦 Batch Assembly: Combine multiple images into NCHW/NHWC batch tensors
- 📦 Batch Processing: Process multiple images with concurrency control
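The layout permutation behind the tensor operations above (interleaved HWC, as images are stored, to planar CHW, as PyTorch-style models expect) can be sketched as follows; this is an illustrative index calculation, not the library's Tensor API:

```swift
// Permute an interleaved HWC buffer into a planar CHW buffer.
func hwcToCHW(_ hwc: [Float], height: Int, width: Int, channels: Int) -> [Float] {
    var chw = [Float](repeating: 0, count: hwc.count)
    for h in 0..<height {
        for w in 0..<width {
            for c in 0..<channels {
                // source index: (h * width + w) * channels + c
                // target index: c * height * width + h * width + w
                chw[c * height * width + h * width + w] =
                    hwc[(h * width + w) * channels + c]
            }
        }
    }
    return chw
}

// 1×2 RGB image: interleaved [R0,G0,B0, R1,G1,B1] → planar [R0,R1, G0,G1, B0,B1]
let chw = hwcToCHW([1, 2, 3, 4, 5, 6], height: 1, width: 2, channels: 3)
```

Batch assembly then concatenates one such CHW block per image along a leading N dimension.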
### Visualization & Analysis
- 🎨 Drawing/Visualization: Draw boxes, keypoints, masks, and heatmaps for debugging
- 📈 Image Analysis: Statistics, metadata, validation, blur detection
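One common blur metric (possibly what the blur-detection bullet refers to; the library's exact method is not documented here) is the variance of the Laplacian: sharp images produce high-variance edge responses, blurry ones low. A hedged sketch:

```swift
// Variance of a 3×3 Laplacian over a grayscale buffer; low variance
// suggests a blurry image. Illustrative, not the library's implementation.
func laplacianVariance(gray: [Double], width: Int, height: Int) -> Double {
    var responses: [Double] = []
    for y in 1..<(height - 1) {
        for x in 1..<(width - 1) {
            let i = y * width + x
            // 4-neighbor Laplacian kernel: [0 1 0; 1 -4 1; 0 1 0]
            let lap = gray[i - 1] + gray[i + 1] + gray[i - width] + gray[i + width]
                    - 4 * gray[i]
            responses.append(lap)
        }
    }
    let mean = responses.reduce(0, +) / Double(responses.count)
    return responses.reduce(0) { $0 + ($1 - mean) * ($1 - mean) } / Double(responses.count)
}

// A perfectly flat image has zero Laplacian response everywhere
let flat = laplacianVariance(gray: [Double](repeating: 0.5, count: 9), width: 3, height: 3)
```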
## 📱 Example App

A comprehensive iOS example app is included in the `Example/` directory, demonstrating all major features:
- TensorFlow Lite Classification - MobileNetV2 with TopK results
- ExecuTorch Classification - MobileNetV3 with TopK results
- Object Detection - YOLOv8 with NMS and bounding box visualization
- Semantic Segmentation - DeepLabV3 with colored mask overlay
- Depth Estimation - Depth Anything with colormaps and overlay visualization
- Pixel Extraction - Model presets (YOLO, RT-DETR, MobileNet, ResNet, ViT, CLIP, SAM2, DeepLab, etc.) and custom options
- Bounding Box Utilities - Format conversion, IoU calculation, NMS, scaling, clipping
- Image Augmentation - Rotation, flip, brightness, contrast, saturation, blur
- Tensor Operations - Channel extraction, permutation, batch assembly
- Drawing & Visualization - Boxes, labels, masks, and overlays
- Comprehensive UI Tests - 50+ UI tests covering all features
To run the example app:

```bash
cd Example/SwiftPixelUtilsExampleApp
pod install
open SwiftPixelUtilsExampleApp.xcworkspace
```
To run UI tests, select the `SwiftPixelUtilsExampleAppUITests` target and press ⌘U.
## 📦 Installation
### Swift Package Manager

Add SwiftPixelUtils to your `Package.swift`:

```swift
dependencies: [
    .package(url: "https://github.com/manishkumar03/SwiftPixelUtils.git", from: "1.0.0")
]
```
Or add it via Xcode:
- File → Add Package Dependencies
- Enter the repository URL
- Select version/branch
## 🚀 Quick Start
⚠️ **Important**: SwiftPixelUtils functions are synchronous and use `throws` (not `async throws`). Remote URLs (`http`, `https`) are not supported; download images first and use `.data(Data)` or `.file(URL)` instead.
### Raw Pixel Data Extraction

```swift
import SwiftPixelUtils

// From local file
let result = try PixelExtractor.getPixelData(
    source: .file(URL(fileURLWithPath: "/path/to/image.jpg")),
    options: PixelDataOptions()
)

print(result.data)   // Float array of pixel values
print(result.width)  // Image width
print(result.height) // Image height
print(result.shape)  // [height, width, channels]

// From downloaded data (for remote images)
let (data, _) = try await URLSession.shared.data(from: URL(string: "https://example.com/image.jpg")!)
let result2 = try PixelExtractor.getPixelData(
    source: .data(data),
    options: PixelDataOptions()
)
```
### Using Model Presets

```swift
import SwiftPixelUtils

// Use pre-configured YOLO settings
let result = try PixelExtractor.getPixelData(
    source: .file(URL(fileURLWithPath: "/path/to/image.jpg")),
    options: ModelPresets.yolov8
)
// Automatically configured: 640x640, letterbox resize, RGB, scale normalization, NCHW layout

// Or MobileNet
let mobileNetResult = try PixelExtractor.getPixelData(
    source: .file(URL(fileURLWithPath: "/path/to/image.jpg")),
    options: ModelPresets.mobilenet
)
// Configured: 224x224, cover resize, RGB, ImageNet normalization, NHWC layout
```
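After inference, the classifier postprocessing reduces to a softmax followed by a top-k selection. A hedged sketch of that step; the `topK` helper is an assumption for illustration, not the `ClassificationOutput` API:

```swift
import Foundation

// Softmax over raw logits, then the k most probable class indices.
func topK(_ logits: [Double], k: Int) -> [(index: Int, prob: Double)] {
    let maxLogit = logits.max() ?? 0
    let exps = logits.map { exp($0 - maxLogit) }  // subtract max for stability
    let sum = exps.reduce(0, +)
    return Array(
        exps.enumerated()
            .map { (index: $0.offset, prob: $0.element / sum) }
            .sorted { $0.prob > $1.prob }
            .prefix(k)
    )
}

// The class with the highest logit (index 1) comes first
let top2 = topK([1.0, 3.0, 2.0], k: 2)
```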
### Available Model Presets
#### Classification Models
| Preset | Size | Resize | Normalization | Layout |
|--------|------|--------|---------------|--------|
| mobilenet / mobilenet_v2 / mobilenet_v3 | 224×224 | cover | ImageNet | NHWC |
| efficientnet | 224×224 | cover | ImageNet | NHWC |
| resnet / resnet50 | 224×224 | cover | ImageNet | NCHW |
| vit | 224×224 | cover | ImageNet | NCHW |
| clip | 224×224 | cover | CLIP-specific | NCHW |
| dino | 224×224 | cover | ImageNet | NCHW |
#### Detection Models
| Preset | Size | Resize | Normalization | Layout | Notes |
|--------|------|--------|---------------|--------|-------|
| yolo / yolov8 | 640×640 | letterbox | scale | NCHW | Standard YOLO |
| yolov9 | 640×640 | letterbox | scale | NCHW | PGI + GELAN |
| yolov10 / yolov10_n/s/m/l/x | 640×640 | letterbox | scale | NCHW | NMS-free |
| rtdetr / rtdetr_l / rtdetr_x | 640×640 | letterbox | scale | NCHW | Real-time DETR |
| detr | 800×800 | contain | ImageNet | NCHW | Transformer detection |
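The letterbox resize used by the YOLO-family presets above scales the image to fit the square target, pads the remainder, and keeps the transform so predicted coordinates can be mapped back. A geometry-only sketch (the library's transform-metadata type is not shown here):

```swift
// Letterbox geometry: uniform scale to fit, centered padding to square.
func letterbox(srcW: Int, srcH: Int, dst: Int) -> (scale: Double, padX: Double, padY: Double) {
    let scale = Swift.min(Double(dst) / Double(srcW), Double(dst) / Double(srcH))
    let newW = Double(srcW) * scale
    let newH = Double(srcH) * scale
    return (scale, (Double(dst) - newW) / 2, (Double(dst) - newH) / 2)
}

// Map a model-space x coordinate back to the original image
func unletterboxX(_ x: Double, scale: Double, padX: Double) -> Double {
    (x - padX) / scale
}

// 1280×720 into 640×640: scale 0.5, no horizontal pad, 140 px top/bottom pad
let t = letterbox(srcW: 1280, srcH: 720, dst: 640)
```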
#### Segmentation Models
| Preset | Size | Resize | Normalization | Layout | Notes |
|--------|------|--------|---------------|--------|-------|
| sam | 1024×1024 | contain | ImageNet | NCHW | Segment Anything |
| sam2 / sam2_t/s/b_plus/l | 1024×1024 | contain | ImageNet | NCHW | SAM2 with video |
| mask2former / mask2former_swin_t/l | 512×512 | contain | ImageNet | NCHW | Universal segmentation |
| deeplab / deeplabv3 / deeplabv3_plus | 513×513 | contain | ImageNet | NCHW | ASPP module |
| deeplab_769 / deeplab_1025 |
