macocr

An OCR Tool using Apple's Vision Framework API.

Command Line Arguments

OCR Tool using Vision Framework API

Usage: macocr [OPTIONS] [FILES]...

Arguments:
  [FILES]...  Input files

Options:
  -o, --ocr          OCR and export text files
  -s, --server       Run HTTP Server
  -a, --auth <AUTH>  HTTP Basic Auth (username:password) [default: ]
  -p, --port <PORT>  HTTP port number [default: 8000]
  -h, --help         Print help
  -V, --version      Print version

How to use

Read images and perform OCR, then output the result to stdout

macocr *.png

Read images and perform OCR, then output the result to text files

macocr -o *.png

Start the OCR HTTP server and specify the HTTP port

macocr -s -p 80

Start the OCR HTTP server and configure HTTP Basic Auth

macocr -s -a admin:password123 -p 80

After starting the HTTP server, you can upload an image from the homepage HTML or use curl to send an image via the upload API:

curl -u admin:password123 \
  -H "Accept: application/json" \
  -X POST http://localhost:80/upload \
  -F "file=@01.png"

The JSON response looks like this:

{
    "success": true,
    "message": "File uploaded successfully",
    "ocr_result": "Hello\nWorld\n",
    "image_width": 1247,
    "image_height": 648,
    "ocr_boxes": [
        {
            "text": "Hello",
            "x": 429.5830268255751,
            "y": 267.7961617530676,
            "w": 201.98298336909374,
            "h": 72.4076766967774,
            "rect": {
                "top_left_x": 429.5830268255751,
                "top_left_y": 268.2039872230384,
                "top_right_x": 631.4207561940295,
                "top_right_y": 267.7961617530676,
                "bottom_right_x": 631.5660101946688,
                "bottom_right_y": 339.7960129798743,
                "bottom_left_x": 429.7282808262144,
                "bottom_left_y": 340.203838449845
            }
        },
        {
            "text": "World",
            "x": 421.6618595339102,
            "y": 417.99999973333337,
            "w": 251.79807692307696,
            "h": 80.0,
            "rect": {
                "top_left_x": 421.6618595339102,
                "top_left_y": 417.99999973333337,
                "top_right_x": 673.4599364569872,
                "top_right_y": 417.99999973333337,
                "bottom_right_x": 673.4599364569872,
                "bottom_right_y": 497.99999973333337,
                "bottom_left_x": 421.6618595339102,
                "bottom_left_y": 497.99999973333337
            }
        }
    ]
}

image_width and image_height represent the width and height of the image (in px), x and y represent the top-left origin of the text bounding box (in px), w and h represent the width and height of the text bounding box (in px), rect provides the four corner coordinates of the detected text region, preserving its original orientation (non-axis-aligned).

Installation

Install by cargo

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install macocr
cargo install macocr
macocr -h

Features

Directly invoke Apple's Vision Framework API for OCR
Command-line mode: allows batch processing of image files and exports OCR results as TXT files
HTTP server mode: provides a web interface to upload images and return OCR results
Supports both HTML form upload and API interfaces
Configurable HTTP Basic Auth authentication
The maximum upload image size is 100 MB

Use cases

macOS users need to perform batch OCR processing
Applications that need to integrate OCR functionality via API

License

MIT License

Macocr

Install / Use

README