Macocr

A CLI OCR tool using Apple’s Vision framework for macOS 13.0+

macocr

An OCR Tool using Apple's Vision Framework API.

Command Line Arguments

OCR Tool using Vision Framework API

Usage: macocr [OPTIONS] [FILES]...

Arguments:
  [FILES]...  Input files

Options:
  -o, --ocr          OCR and export text files
  -s, --server       Run HTTP Server
  -a, --auth <AUTH>  HTTP Basic Auth (username:password) [default: ]
  -p, --port <PORT>  HTTP port number [default: 8000]
  -h, --help         Print help
  -V, --version      Print version

How to use

Read images and perform OCR, then output the result to stdout

macocr *.png

Read images and perform OCR, then output the result to text files

macocr -o *.png
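The stdout form shown above can also be driven from a script. The following is a minimal Python sketch, not part of macocr: the helper name ocr_to_string and its exe parameter are hypothetical, and it assumes only that macocr is on PATH and prints the recognized text to stdout as described. The layout of that text for multiple files is not specified here, so the sketch returns it unparsed.

```python
import glob
import subprocess


def ocr_to_string(pattern, exe="macocr"):
    """Run `exe` on every file matching `pattern` and return its stdout.

    `exe` defaults to macocr; it is a parameter only so that the helper
    can be exercised with another program where macocr is not installed.
    """
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # subprocess does not expand wildcards the way a shell does,
    # so the pattern is resolved with glob before the call.
    done = subprocess.run(
        [exe, *files], capture_output=True, text=True, check=True
    )
    return done.stdout
```

Calling ocr_to_string("*.png") would then correspond to running macocr *.png in a shell; check=True raises an exception if the command exits with a non-zero status.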

Start the OCR HTTP server and specify the HTTP port

macocr -s -p 80

Start the OCR HTTP server and configure HTTP Basic Auth

macocr -s -a admin:password123 -p 80

After starting the HTTP server, you can upload an image from the homepage HTML or use curl to send an image via the upload API:

curl -u admin:password123 \
  -H "Accept: application/json" \
  -X POST http://localhost:80/upload \
  -F "file=@01.png"
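For callers that prefer not to shell out to curl, the same request can be sketched with the Python standard library. This is an illustrative sketch rather than an official client: the function names build_upload_request and upload are hypothetical, and it assumes only what the curl command shows, namely a POST to /upload with a multipart form field named file, HTTP Basic Auth, and an Accept: application/json header.

```python
import base64
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_upload_request(url, image_path, username, password):
    """Build a multipart/form-data POST mirroring the curl command."""
    path = Path(image_path)
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    # A single form part named "file", like curl's -F "file=@01.png".
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    body = head + path.read_bytes() + tail

    # HTTP Basic Auth: base64 of "username:password", like curl's -u.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def upload(url, image_path, username, password):
    """Send the request and decode the JSON body of the response."""
    req = build_upload_request(url, image_path, username, password)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a server started as in the example above, upload("http://localhost:80/upload", "01.png", "admin", "password123") would send the same request as the curl command.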

The JSON response looks like this:

{
    "success": true,
    "message": "File uploaded successfully",
    "ocr_result": "Hello\nWorld\n",
    "image_width": 1247,
    "image_height": 648,
    "ocr_boxes": [
        {
            "text": "Hello",
            "x": 429.5830268255751,
            "y": 267.7961617530676,
            "w": 201.98298336909374,
            "h": 72.4076766967774,
            "rect": {
                "top_left_x": 429.5830268255751,
                "top_left_y": 268.2039872230384,
                "top_right_x": 631.4207561940295,
                "top_right_y": 267.7961617530676,
                "bottom_right_x": 631.5660101946688,
                "bottom_right_y": 339.7960129798743,
                "bottom_left_x": 429.7282808262144,
                "bottom_left_y": 340.203838449845
            }
        },
        {
            "text": "World",
            "x": 421.6618595339102,
            "y": 417.99999973333337,
            "w": 251.79807692307696,
            "h": 80.0,
            "rect": {
                "top_left_x": 421.6618595339102,
                "top_left_y": 417.99999973333337,
                "top_right_x": 673.4599364569872,
                "top_right_y": 417.99999973333337,
                "bottom_right_x": 673.4599364569872,
                "bottom_right_y": 497.99999973333337,
                "bottom_left_x": 421.6618595339102,
                "bottom_left_y": 497.99999973333337
            }
        }
    ]
}

image_width and image_height are the width and height of the image, in px. For each entry in ocr_boxes, x and y are the top-left origin of the text bounding box and w and h are its width and height, all in px. rect gives the four corner coordinates of the detected text region in its original orientation, so it need not be axis-aligned.
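To make the relationship between these fields concrete, here is a small Python sketch that works on the "Hello" box from the sample response. The helper names corners, enclosing_box, and skew_degrees are hypothetical. In this sample, x, y, w, and h coincide with the axis-aligned box that encloses the four rect corners; that is an observation about the sample data, not a documented guarantee.

```python
import json
import math

# The first entry of "ocr_boxes" from the sample response.
HELLO = json.loads("""
{
    "text": "Hello",
    "x": 429.5830268255751,
    "y": 267.7961617530676,
    "w": 201.98298336909374,
    "h": 72.4076766967774,
    "rect": {
        "top_left_x": 429.5830268255751,
        "top_left_y": 268.2039872230384,
        "top_right_x": 631.4207561940295,
        "top_right_y": 267.7961617530676,
        "bottom_right_x": 631.5660101946688,
        "bottom_right_y": 339.7960129798743,
        "bottom_left_x": 429.7282808262144,
        "bottom_left_y": 340.203838449845
    }
}
""")

CORNER_NAMES = ("top_left", "top_right", "bottom_right", "bottom_left")


def corners(box):
    """Return the four rect corners as (x, y) tuples, clockwise from top-left."""
    rect = box["rect"]
    return [(rect[f"{name}_x"], rect[f"{name}_y"]) for name in CORNER_NAMES]


def enclosing_box(box):
    """Axis-aligned (x, y, w, h) that encloses the four rect corners."""
    xs, ys = zip(*corners(box))
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def skew_degrees(box):
    """Angle of the top edge relative to the x axis, in degrees."""
    (x0, y0), (x1, y1) = corners(box)[:2]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


if __name__ == "__main__":
    print(enclosing_box(HELLO))
    print(skew_degrees(HELLO))
```

A skew close to zero, as in this sample, means the text line is nearly horizontal; larger values indicate rotated text, for which rect is the more faithful outline.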

Installation

Install with cargo

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install macocr
cargo install macocr
macocr -h

Features

  • Invokes Apple's Vision Framework API directly for OCR
  • Command-line mode: allows batch processing of image files and exports OCR results as TXT files
  • HTTP server mode: provides a web interface to upload images and return OCR results
  • Supports both HTML form upload and API interfaces
  • Configurable HTTP Basic Auth authentication
  • The maximum upload image size is 100 MB

Use cases

  • macOS users who need to perform batch OCR processing
  • Applications that need to integrate OCR functionality via API

License

MIT License
