Ocrbase

📄 PDF -> .MD/.JSON API & SDK for PaddleOCR-VL with structured data extraction. Self-hostable.


ocrbase

Turn PDFs into structured data at scale. Powered by frontier open-weight OCR models.

Quickstart

Try the Playground

At ocrbase.dev you can parse and extract data from documents, up to 1K pages for free.

Cloud API

Generate an API key at ocrbase.dev
Add it to your .env file:
```
OCRBASE_API_KEY=sk_xxx
```
Install the SDK:
```
npm install ocrbase-sdk
```

Parse a document:

```
import { parse } from "ocrbase-sdk";

const { text } = await parse("./invoice.pdf");
console.log(text);
```

Or use the API directly with curl:

```
curl -X POST https://api.ocrbase.dev/v1/parse \
  -H "Authorization: Bearer sk_xxx" \
  -F "file=@document.pdf"
```

Self-host

Prerequisites: Bun, Docker Desktop

```
git clone https://github.com/majcheradam/ocrbase
cd ocrbase
bun install
cp .env.example .env     # then edit .env and set PADDLE_OCR_URL to your PaddleOCR instance
docker compose up -d     # starts postgres, redis, minio
bun run db:push          # set up the database
bun run dev              # start the API server + worker
```

The API will be available at http://localhost:3000. See the Self-Hosting Guide for PaddleOCR setup, GPU configuration, and all environment variables.

How It Works

ocrbase has two core operations. Both are asynchronous — you submit a request, get a job ID, and retrieve the result when it's ready.

Parse (`POST /v1/parse`)

Converts a PDF into Markdown. Upload a file and ocrbase OCRs every page and returns clean Markdown text.

```
curl -X POST https://api.ocrbase.dev/v1/parse \
  -H "Authorization: Bearer sk_xxx" \
  -F "file=@document.pdf"
```

Extract (`POST /v1/extract`)

Converts a PDF into structured JSON. You provide a file and a schema ID, and ocrbase OCRs the document then uses an LLM to extract data matching your schema.

```
curl -X POST https://api.ocrbase.dev/v1/extract \
  -H "Authorization: Bearer sk_xxx" \
  -F "file=@invoice.pdf" \
  -F "schemaId=inv_schema_123"
```
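The same extract request can be sent from TypeScript with the built-in `fetch` and `FormData`. This is a minimal sketch, not the SDK's implementation: the helper names `buildExtractRequest` and `submitExtract` are hypothetical, and the response is typed as `unknown` because the job payload shape is not described here. Only the URL, the `Authorization` header, and the `file` and `schemaId` form fields come from the curl example.

```typescript
// Hypothetical helper: builds the multipart request for POST /v1/extract.
interface ExtractRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: FormData };
}

function buildExtractRequest(
  apiKey: string,
  file: Blob,
  fileName: string,
  schemaId: string,
): ExtractRequest {
  const body = new FormData();
  body.append("file", file, fileName); // same field name as -F "file=@..."
  body.append("schemaId", schemaId); // same field name as -F "schemaId=..."
  return {
    url: "https://api.ocrbase.dev/v1/extract",
    // No Content-Type header: fetch sets the multipart boundary itself.
    init: { method: "POST", headers: { Authorization: `Bearer ${apiKey}` }, body },
  };
}

// Hypothetical helper: sends the request and returns the raw JSON response.
async function submitExtract(
  apiKey: string,
  file: Blob,
  fileName: string,
  schemaId: string,
): Promise<unknown> {
  const { url, init } = buildExtractRequest(apiKey, file, fileName, schemaId);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Extract request failed: HTTP ${res.status}`);
  return res.json(); // expected to carry a job ID; the field name is not assumed here
}
```

Keeping request construction separate from sending makes the request easy to inspect or unit test without touching the network.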

Checking Results

Polling — fetch the job status until it completes:

```
curl https://api.ocrbase.dev/v1/jobs/job_xxx \
  -H "Authorization: Bearer sk_xxx"
```
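A polling loop around that endpoint might look like the sketch below. The job payload is not documented in this README, so the `status` field and the terminal values `"completed"` and `"failed"` are assumptions; check the OpenAPI reference for the real shape. `pollJob`, `getJobFromApi`, and the option names are hypothetical.

```typescript
// Assumed payload shape: only `status` is read, and its values are guesses.
interface Job {
  status: string;
}

type GetJob = (jobId: string) => Promise<Job>;

interface PollOptions {
  intervalMs?: number; // delay between lookups
  maxAttempts?: number; // give up after this many lookups
  terminal?: string[]; // statuses that end the loop (assumed names)
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls getJob repeatedly until the job reaches a terminal status.
async function pollJob(
  getJob: GetJob,
  jobId: string,
  { intervalMs = 2000, maxAttempts = 60, terminal = ["completed", "failed"] }: PollOptions = {},
): Promise<Job> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob(jobId);
    if (terminal.includes(job.status)) return job;
    await sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} attempts`);
}

// Fetcher for the documented endpoint GET /v1/jobs/{id}.
function getJobFromApi(apiKey: string): GetJob {
  return async (jobId) => {
    const res = await fetch(`https://api.ocrbase.dev/v1/jobs/${jobId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`Job lookup failed: HTTP ${res.status}`);
    return (await res.json()) as Job;
  };
}
```

Usage would be along the lines of `await pollJob(getJobFromApi("sk_xxx"), "job_xxx")`. Passing the fetcher in as a parameter keeps the loop testable with a fake.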

WebSocket — subscribe to real-time status updates instead of polling:

```
wscat -c "wss://api.ocrbase.dev/v1/realtime?job_id=job_xxx" \
  -H "Authorization: Bearer sk_xxx"
```

Features

- Best-in-class OCR: uses PaddleOCR-VL-1.5 0.9B for accurate text extraction from PDFs
- Structured extraction: define a JSON schema and get structured data back from any document
- Built for scale: queue-based job processing with BullMQ so you can process thousands of documents
- Real-time updates: WebSocket notifications for job progress instead of polling
- Self-hostable: run the entire stack on your own infrastructure with Docker

SDK

Install the TypeScript SDK from npm:

```
npm install ocrbase-sdk
```

ocrbase-sdk on npm | Source on GitHub

The SDK provides type-safe methods for parsing, extraction, schema management, and real-time WebSocket subscriptions.

API Reference

Interactive OpenAPI UI: https://api.ocrbase.dev/openapi
OpenAPI JSON: https://api.ocrbase.dev/openapi/json

LLM Integration

Parse documents with ocrbase before sending to LLMs. Raw PDF binary wastes tokens and produces poor results — sending clean Markdown from ocrbase gives much better LLM output at a fraction of the cost.
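As a rough illustration of that flow, the sketch below parses a PDF first and then wraps the resulting Markdown in a prompt. The `parse(path)` call returning `{ text }` matches the Quickstart; `buildPrompt`, `promptFromPdf`, and the prompt wording are hypothetical, and the parse function is passed in as a parameter so the sketch does not depend on any particular LLM client.

```typescript
// Matches the Quickstart usage: parse(path) resolves to an object with `text`.
type ParseFn = (path: string) => Promise<{ text: string }>;

// Hypothetical helper: wraps parsed Markdown and a question into one prompt.
function buildPrompt(markdown: string, question: string): string {
  return [
    "Answer the question using only the document below.",
    "",
    "<document>",
    markdown.trim(),
    "</document>",
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Hypothetical helper: parse the PDF to Markdown, then build the prompt text.
async function promptFromPdf(
  parse: ParseFn,
  pdfPath: string,
  question: string,
): Promise<string> {
  const { text } = await parse(pdfPath);
  return buildPrompt(text, question);
}
```

With the SDK installed, this could be called as `promptFromPdf(parse, "./invoice.pdf", "What is the invoice total?")`, where `parse` is imported from `ocrbase-sdk`, and the returned string sent to the LLM of your choice.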

Architecture

Architecture Diagram

Tech Stack

| Layer         | Technology               |
| ------------- | ------------------------ |
| Runtime       | Bun                      |
| API Framework | Elysia                   |
| SDK           | Eden Treaty              |
| Database      | PostgreSQL + Drizzle ORM |
| Queue         | Redis + BullMQ           |
| Storage       | S3/MinIO                 |
| OCR           | PaddleOCR-VL 1.5         |
| Auth          | Better-Auth              |
| Build         | Turborepo                |

Self-Hosting

See the Self-Hosting Guide for the full deployment walkthrough including PaddleOCR setup, all environment variables, and API endpoint reference.

Requirements: Bun, Docker Desktop

Health Checks

GET /v1/health/live — liveness check
GET /v1/health/ready — readiness check (confirms all dependencies are connected)
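For a self-hosted deployment, a startup script could wait on the readiness endpoint before submitting jobs. The sketch below assumes the endpoint answers with a 2xx status once dependencies are connected and a non-2xx status (or a connection error) otherwise; that behaviour is not spelled out here. `waitUntilReady`, `httpProbe`, and the retry parameters are hypothetical names.

```typescript
// A probe reports whether a URL currently answers successfully.
type Probe = (url: string) => Promise<boolean>;

// Polls GET /v1/health/ready until the probe succeeds or attempts run out.
async function waitUntilReady(
  baseUrl: string,
  probe: Probe,
  attempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  const url = `${baseUrl}/v1/health/ready`;
  for (let i = 0; i < attempts; i++) {
    // A thrown error (e.g. connection refused) counts as "not ready yet".
    const ready = await probe(url).catch(() => false);
    if (ready) return true;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}

// Probe based on fetch: ready means any 2xx response (an assumption).
const httpProbe: Probe = async (url) => (await fetch(url)).ok;
```

Against the local dev server from the Self-host section, this would be `await waitUntilReady("http://localhost:3000", httpProbe)`.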

License

MIT — See LICENSE for details.

Contact

For API access, on-premise deployment, or questions: adammajcher20@gmail.com

Repository: ocrbase-hq/ocrbase

GitHub Stars: 986
Forks: 76
Category: Development
Languages: TypeScript
Updated: 4h ago
Security Score: 100/100 (audited on Apr 8, 2026; no findings)