SwiftPixelUtils

<p align="center"> <img src="https://img.shields.io/badge/Swift-5.9-orange" alt="Swift"> <img src="https://img.shields.io/badge/platforms-iOS%2015%2B%20%7C%20macOS%2012%2B-blue" alt="Platforms"> <img src="https://img.shields.io/badge/SPM-compatible-brightgreen" alt="SPM"> <img src="https://img.shields.io/github/license/manishkumar03/SwiftPixelUtils" alt="License"> </p> <p align="center"> <strong>High-performance Swift library for image preprocessing optimized for ML/AI inference pipelines on iOS/macOS</strong> </p> <p align="center"> <a href="#installation">Installation</a> • <a href="#quick-start">Quick Start</a> • <a href="#api-reference">API Reference</a> • <a href="#features">Features</a> • <a href="docs/README.md">📚 Docs</a> </p>

High-performance Swift library for image preprocessing optimized for ML/AI inference pipelines. Native implementations using Core Image, Accelerate, and Core ML for pixel extraction, tensor conversion, quantization, augmentation, and model-specific preprocessing (YOLO, MobileNet, etc.).

✨ Features

Core Preprocessing

  • 🚀 High Performance: Native implementations using Apple frameworks (Core Image, Accelerate, vImage, Core ML)
  • 🔢 Raw Pixel Data: Extract pixel values as typed arrays (Float, Float16, Int32, UInt8) ready for ML inference
  • 🎨 Multiple Color Formats: RGB, RGBA, BGR, BGRA, Grayscale, HSV, HSL, LAB, YUV, YCbCr
  • 📐 Flexible Resizing: Cover, contain, stretch, and letterbox strategies with automatic transform metadata
  • ✂️ ROI Pipeline: Crop → resize → normalize in a single call via ROI options
  • 🔢 ML-Ready Normalization: ImageNet, TensorFlow, custom presets
  • 📊 Multiple Data Layouts: HWC, CHW, NHWC, NCHW (PyTorch/TensorFlow compatible)
  • 🖼️ Multiple Sources: local file URLs, data, base64, assets, photo library
  • 📱 Orientation Handling: Opt-in UIImage/EXIF orientation normalization to fix silent rotation issues

ML Framework Integration

  • 🤖 Simplified ML APIs: One-line preprocessing (getModelInput) and postprocessing (ClassificationOutput, DetectionOutput, SegmentationOutput, DepthEstimationOutput) for all major frameworks
  • 🤖 Model Presets: Pre-configured settings for YOLO (v8/v9/v10), RT-DETR, MobileNet, EfficientNet, ResNet, ViT, CLIP, SAM/SAM2, DINO, DETR, Mask2Former, UNet, DeepLab, SegFormer, FCN, PSPNet
  • 🎯 Framework Targets: Automatic configuration for PyTorch, TensorFlow, TFLite, CoreML, ONNX Runtime, ExecuTorch, OpenCV
  • 🏷️ Label Database: Built-in labels for COCO, ImageNet, VOC, CIFAR, Places365, ADE20K, Open Images, LVIS, Objects365, Kinetics

ONNX Runtime Integration

  • 🔌 ONNX Helper: Streamlined tensor creation for ONNX Runtime with ONNXHelper.createTensorData()
  • 📊 ONNX Data Types: Support for Float32, Float16, UInt8, Int8, Int32, Int64 tensor types
  • 🎯 ONNX Model Configs: Pre-configured settings for YOLOv8, RT-DETR, ResNet, MobileNetV2, ViT, CLIP
  • 🔍 Output Parsing: Built-in parsers for YOLOv8, YOLOv5, RT-DETR, SSD detection outputs
  • 🧮 Segmentation Output: Parse ONNX segmentation model outputs with argmax
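
The argmax step above is simple to sketch in isolation: for each pixel, pick the class channel with the highest logit. This is a library-independent illustration assuming CHW-ordered logits, not the library's actual parser:

```swift
import Foundation

/// Decode a segmentation mask from CHW class logits:
/// for every pixel, choose the class plane with the highest score.
func argmaxCHW(_ logits: [Float], classes: Int, height: Int, width: Int) -> [Int] {
    var mask = [Int](repeating: 0, count: height * width)
    for p in 0..<(height * width) {
        var best = 0
        var bestScore = logits[p]  // class-0 plane
        for c in 1..<classes {
            let s = logits[c * height * width + p]
            if s > bestScore { bestScore = s; best = c }
        }
        mask[p] = best
    }
    return mask
}
```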

Quantization

  • 🎯 Native Quantization: Float→Int8/UInt8/Int16/INT4 with per-tensor and per-channel support (TFLite/ExecuTorch compatible)
  • 🔢 INT4 Quantization: 4-bit quantization (8× compression) for LLM weights and edge deployment
  • 📊 Per-Channel Quantization: Channel-wise scale/zeroPoint for higher accuracy (CNN, Transformer weights)
  • 🔄 Float16 Conversion: IEEE 754 half-precision ↔ Float32 utilities for CVPixelBuffer processing
  • 🎥 CVPixelBuffer Formats: BGRA/RGBA, NV12, and RGB565 conversion to tensor data
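
The core of affine (asymmetric) quantization is a one-line formula: `q = clamp(round(x / scale) + zeroPoint)`. The sketch below shows the math for the per-tensor Int8 case in the style of the TFLite scheme; the function names are illustrative, not SwiftPixelUtils API:

```swift
import Foundation

/// Quantize Float32 values to Int8 with a single (per-tensor) scale/zero-point.
func quantizeInt8(_ values: [Float], scale: Float, zeroPoint: Int32) -> [Int8] {
    values.map { x in
        let q = Int32((x / scale).rounded()) + zeroPoint
        return Int8(clamping: q)  // saturate to [-128, 127]
    }
}

/// Inverse mapping back to Float32 (lossy by quantization step).
func dequantizeInt8(_ values: [Int8], scale: Float, zeroPoint: Int32) -> [Float] {
    values.map { Float(Int32($0) - zeroPoint) * scale }
}
```

Per-channel quantization applies the same formula with a separate `scale`/`zeroPoint` pair per output channel, which is why it tends to preserve accuracy better for CNN and Transformer weights.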

Detection & Segmentation

  • 📦 Bounding Box Utilities: Format conversion (xyxy/xywh/cxcywh), scaling, clipping, IoU, NMS
  • 🖼️ Letterbox Padding: YOLO-style letterbox preprocessing with automatic transform metadata for reverse coordinate mapping
  • 📏 Depth Estimation: Process MiDaS, DPT, ZoeDepth, Depth Anything outputs with scientific colormaps (Viridis, Plasma, Turbo) and custom colormaps
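
Two of the box utilities named above are small enough to show in full. This is a self-contained sketch of the underlying math (cxcywh→xyxy conversion and IoU), not the library's actual implementation:

```swift
import Foundation

/// Convert a [cx, cy, w, h] box (typical raw YOLO output) to [x1, y1, x2, y2].
func cxcywhToXYXY(_ b: [Float]) -> [Float] {
    [b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2]
}

/// Intersection-over-union of two xyxy boxes, the score NMS thresholds on.
func iou(_ a: [Float], _ b: [Float]) -> Float {
    let ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    let iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    let inter = ix * iy
    let areaA = (a[2] - a[0]) * (a[3] - a[1])
    let areaB = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (areaA + areaB - inter)
}
```

The letterbox transform metadata mentioned above exists so that boxes computed in the padded 640×640 model space can be mapped back through the inverse scale and padding offsets into original-image coordinates.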

Data Augmentation

  • 🔄 Image Augmentation: Rotation, flip, brightness, contrast, saturation, blur
  • 🎨 Color Jitter: Granular brightness/contrast/saturation/hue control with range support and seeded randomness
  • ✂️ Cutout/Random Erasing: Mask random regions with constant/noise fill for robustness training
  • 🎲 Random Crop with Seed: Reproducible random crops for data augmentation pipelines
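
Seeded randomness is what makes augmentation pipelines reproducible. Swift's `SystemRandomNumberGenerator` cannot be seeded, so a seedable generator is needed; the sketch below (names illustrative, not SwiftPixelUtils API) uses a SplitMix64 generator to pick a deterministic crop origin:

```swift
import Foundation

/// Minimal seedable RNG (SplitMix64) so results repeat for a given seed.
struct SplitMix64: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

/// Pick a reproducible origin for a cropW×cropH window inside width×height.
func randomCropOrigin(width: Int, height: Int, cropW: Int, cropH: Int,
                      seed: UInt64) -> (x: Int, y: Int) {
    var rng = SplitMix64(state: seed)
    let x = Int.random(in: 0...(width - cropW), using: &rng)
    let y = Int.random(in: 0...(height - cropH), using: &rng)
    return (x, y)
}
```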

Tensor Operations

  • 🧮 Tensor Operations: Channel extraction, patch extraction, permutation, batch concatenation
  • 🔙 Tensor to Image: Convert processed tensors back to images
  • 🔲 Grid/Patch Extraction: Extract image patches in grid patterns for sliding window inference
  • ✅ Tensor Validation: Validate tensor shapes, dtypes, and value ranges before inference
  • 📦 Batch Assembly: Combine multiple images into NCHW/NHWC batch tensors
  • 📦 Batch Processing: Process multiple images with concurrency control
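
Because tensors are flat arrays, two of the operations above reduce to index arithmetic. This is a hedged, library-independent sketch (function names are illustrative): extracting one channel from interleaved HWC data, and stacking per-image CHW tensors into an NCHW batch, where the outermost batch dimension makes plain concatenation sufficient:

```swift
import Foundation

/// Extract one channel plane from an interleaved HWC tensor.
func extractChannel(_ hwc: [Float], channels: Int, index: Int) -> [Float] {
    stride(from: index, to: hwc.count, by: channels).map { hwc[$0] }
}

/// Stack equally-shaped CHW tensors into one NCHW batch: since the batch
/// dimension is outermost, appending the images in order is enough.
func assembleNCHWBatch(_ images: [[Float]]) -> [Float] {
    images.flatMap { $0 }
}
```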

Visualization & Analysis

  • 🎨 Drawing/Visualization: Draw boxes, keypoints, masks, and heatmaps for debugging
  • 📈 Image Analysis: Statistics, metadata, validation, blur detection

📱 Example App

A comprehensive iOS example app is included in the Example/ directory, demonstrating all major features:

  • TensorFlow Lite Classification - MobileNetV2 with TopK results
  • ExecuTorch Classification - MobileNetV3 with TopK results
  • Object Detection - YOLOv8 with NMS and bounding box visualization
  • Semantic Segmentation - DeepLabV3 with colored mask overlay
  • Depth Estimation - Depth Anything with colormaps and overlay visualization
  • Pixel Extraction - Model presets (YOLO, RT-DETR, MobileNet, ResNet, ViT, CLIP, SAM2, DeepLab, etc.) and custom options
  • Bounding Box Utilities - Format conversion, IoU calculation, NMS, scaling, clipping
  • Image Augmentation - Rotation, flip, brightness, contrast, saturation, blur
  • Tensor Operations - Channel extraction, permutation, batch assembly
  • Drawing & Visualization - Boxes, labels, masks, and overlays
  • Comprehensive UI Tests - 50+ UI tests covering all features
<p align="center"> <img src="SupportingFiles/example-app-screenshot.png" alt="Example App Screenshot" width="300"> </p>

To run the example app:

```shell
cd Example/SwiftPixelUtilsExampleApp
pod install
open SwiftPixelUtilsExampleApp.xcworkspace
```

To run UI tests, select the SwiftPixelUtilsExampleAppUITests target and press ⌘U.

📦 Installation

Swift Package Manager

Add SwiftPixelUtils to your Package.swift:

```swift
dependencies: [
    .package(url: "https://github.com/manishkumar03/SwiftPixelUtils.git", from: "1.0.0")
]
```

Or add it via Xcode:

  1. File → Add Package Dependencies
  2. Enter the repository URL
  3. Select version/branch

🚀 Quick Start

⚠️ Important: SwiftPixelUtils functions are synchronous and use `throws` (not `async throws`). Remote URLs (`http`, `https`) are not supported; download the image first and pass it via `.data(Data)` or `.file(URL)` instead.

Raw Pixel Data Extraction

```swift
import SwiftPixelUtils

// From local file
let result = try PixelExtractor.getPixelData(
    source: .file(URL(fileURLWithPath: "/path/to/image.jpg")),
    options: PixelDataOptions()
)

print(result.data)   // Float array of pixel values
print(result.width)  // Image width
print(result.height) // Image height
print(result.shape)  // [height, width, channels]

// From downloaded data (for remote images)
let (data, _) = try await URLSession.shared.data(from: URL(string: "https://example.com/image.jpg")!)
let result2 = try PixelExtractor.getPixelData(
    source: .data(data),
    options: PixelDataOptions()
)
```

Using Model Presets

```swift
import SwiftPixelUtils

// Use pre-configured YOLO settings
let result = try PixelExtractor.getPixelData(
    source: .file(URL(fileURLWithPath: "/path/to/image.jpg")),
    options: ModelPresets.yolov8
)
// Automatically configured: 640x640, letterbox resize, RGB, scale normalization, NCHW layout

// Or MobileNet
let mobileNetResult = try PixelExtractor.getPixelData(
    source: .file(URL(fileURLWithPath: "/path/to/image.jpg")),
    options: ModelPresets.mobilenet
)
// Configured: 224x224, cover resize, RGB, ImageNet normalization, NHWC layout
```

Available Model Presets

Classification Models

| Preset | Size | Resize | Normalization | Layout |
|--------|------|--------|---------------|--------|
| mobilenet / mobilenet_v2 / mobilenet_v3 | 224×224 | cover | ImageNet | NHWC |
| efficientnet | 224×224 | cover | ImageNet | NHWC |
| resnet / resnet50 | 224×224 | cover | ImageNet | NCHW |
| vit | 224×224 | cover | ImageNet | NCHW |
| clip | 224×224 | cover | CLIP-specific | NCHW |
| dino | 224×224 | cover | ImageNet | NCHW |

Detection Models

| Preset | Size | Resize | Normalization | Layout | Notes |
|--------|------|--------|---------------|--------|-------|
| yolo / yolov8 | 640×640 | letterbox | scale | NCHW | Standard YOLO |
| yolov9 | 640×640 | letterbox | scale | NCHW | PGI + GELAN |
| yolov10 / yolov10_n/s/m/l/x | 640×640 | letterbox | scale | NCHW | NMS-free |
| rtdetr / rtdetr_l / rtdetr_x | 640×640 | letterbox | scale | NCHW | Real-time DETR |
| detr | 800×800 | contain | ImageNet | NCHW | Transformer detection |

Segmentation Models

| Preset | Size | Resize | Normalization | Layout | Notes |
|--------|------|--------|---------------|--------|-------|
| sam | 1024×1024 | contain | ImageNet | NCHW | Segment Anything |
| sam2 / sam2_t/s/b_plus/l | 1024×1024 | contain | ImageNet | NCHW | SAM2 with video |
| mask2former / mask2former_swin_t/l | 512×512 | contain | ImageNet | NCHW | Universal segmentation |
| deeplab / deeplabv3 / deeplabv3_plus | 513×513 | contain | ImageNet | NCHW | ASPP module |
| deeplab_769 / deeplab_1025 |
