KhmerOCR

A high-performance Khmer Optical Character Recognition (OCR) engine tailored for documents. The model was trained on 3 million text lines rendered in more than 800 Khmer fonts, so that recognition stays robust across styles and weights.

Important update: The library now supports full document processing, layout detection, and multi-format exports (PDF, DOCX, HTML, Markdown).


Features

  • Fast: Optimized for Khmer script, with inference running on ONNX Runtime
  • Native C++ Engine: High-performance C/C++ implementation with C API for FFI bindings
  • Font Detection: Automatically identifies and preserves Moul vs. Regular font styles
  • Multi-format Export: Convert images or PDFs into editable .docx, .md, .html, or .txt files
  • PDF Support: High-resolution PDF rendering and processing via PyMuPDF
  • Cross-Platform: Supports macOS, Linux, Windows, iOS, and Android

Installation

Python

pip install git+https://github.com/seanghay/KhmerOCR

C++ Library

See cpp/README.md for build instructions.

cd cpp
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

Usage

Python CLI

For single images or documents, run:

khmerocr document.jpg --format docx

Python CLI Options

| Option | Shortcut | Description | Default |
| ---------- | -------- | ------------------------------------------ | ------------------------- |
| --output | -o | Custom output path | input_filename.{format} |
| --format | -f | Output format: txt, html, docx, md | txt |
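
To apply the options above to a whole folder, the CLI can be driven from a short script. The following is a minimal sketch, not part of the library: `build_command` and `convert_folder` are hypothetical helper names, the `out` directory and the image extensions are assumptions, and only the `--format` and `--output` flags from the table are used.

```python
import subprocess
from pathlib import Path

# Output formats listed in the options table.
FORMATS = {"txt", "html", "docx", "md"}


def build_command(input_path, fmt="txt", output=None):
    """Return the argv list for one khmerocr invocation (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = ["khmerocr", str(input_path), "--format", fmt]
    if output is not None:
        cmd += ["--output", str(output)]
    return cmd


def convert_folder(folder, fmt="md", out_dir="out"):
    """Run khmerocr once per image in `folder`, writing results to `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for image in sorted(Path(folder).iterdir()):
        # Assumption: only these image extensions are of interest here.
        if image.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        target = out / f"{image.stem}.{fmt}"
        # check=True raises CalledProcessError if the CLI exits with an error.
        subprocess.run(build_command(image, fmt, target), check=True)
```

Calling `convert_folder("scans", fmt="docx")` would then invoke the CLI once per image, assuming `khmerocr` is on the `PATH`.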

C++ CLI

The C++ CLI is a lightweight inference tool focused on text extraction. For document formatting (DOCX, HTML, etc.), use the Python CLI.

# Full OCR (detect + recognize)
./cpp/build/khmerocr image.png

# JSON output
./cpp/build/khmerocr -j image.png

# Detection only
./cpp/build/khmerocr -d image.png

# Recognition only (for pre-cropped text images)
./cpp/build/khmerocr -r cropped_text.png

# Verbose output with confidence scores
./cpp/build/khmerocr -v image.png

| Option | Shortcut | Description |
|--------|----------|-------------|
| --json | -j | Output results in JSON format |
| --detect-only | -d | Only detect text regions, skip recognition |
| --recognize-only | -r | Only recognize (skip detection) |
| --verbose | -v | Show confidence scores |
| --model-dir | -m | Custom model directory path |
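
The JSON mode makes the C++ CLI easy to call from other programs. Below is a rough sketch of a Python wrapper; `run_ocr_json` is a hypothetical name, and it assumes that the binary prints nothing but JSON to standard output when `-j` is set and that `-m` takes the directory as its next argument. The `runner` parameter exists only so the call can be stubbed out in tests.

```python
import json
import subprocess


def run_ocr_json(image, binary="./cpp/build/khmerocr", model_dir=None,
                 runner=subprocess.run):
    """Run the C++ CLI with -j and return the decoded JSON (hypothetical helper)."""
    cmd = [binary, "-j"]
    if model_dir is not None:
        # Assumption: -m is followed by the model directory path.
        cmd += ["-m", str(model_dir)]
    cmd.append(str(image))
    # capture_output collects stdout; check=True raises on a non-zero exit code.
    done = runner(cmd, capture_output=True, text=True, check=True)
    # Assumption: stdout contains only the JSON document.
    return json.loads(done.stdout)
```

If the real output mixes log lines with JSON, the parsing step would need to be adjusted accordingly.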

Example Output

For each recognized line, the engine returns the text, the detected font style, and a confidence score for each:

{
  "text": "លទ្ធផលនៃការធ្វើកំណែទប្រង់លើទូរគមនាគមន៍កម្ពុជា",
  "text_confidence": 0.9804,
  "font": "Moul",
  "font_confidence": 0.9999
}
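
A record like the one above can be loaded into a small typed structure and checked against a confidence threshold. This is an illustrative sketch only: `LineResult`, `parse_line`, `needs_review`, and the 0.90 threshold are assumptions made here, while the four field names come from the example output.

```python
import json
from dataclasses import dataclass


@dataclass
class LineResult:
    text: str
    text_confidence: float
    font: str
    font_confidence: float


def parse_line(raw):
    """Decode one JSON record with the four fields shown in the example."""
    data = json.loads(raw)
    return LineResult(
        text=data["text"],
        text_confidence=float(data["text_confidence"]),
        font=data["font"],
        font_confidence=float(data["font_confidence"]),
    )


def needs_review(result, min_text_conf=0.90):
    """Flag lines whose text confidence is below the (assumed) threshold."""
    return result.text_confidence < min_text_conf


EXAMPLE = """
{
  "text": "លទ្ធផលនៃការធ្វើកំណែទប្រង់លើទូរគមនាគមន៍កម្ពុជា",
  "text_confidence": 0.9804,
  "font": "Moul",
  "font_confidence": 0.9999
}
"""

if __name__ == "__main__":
    line = parse_line(EXAMPLE)
    print(line.font, needs_review(line))
```

A threshold such as this is a simple way to route low-confidence lines to manual proofreading; a suitable value depends on the documents being processed.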


Examples

| Input | Detected Text | Font Style |
| ---------- | ---------------- | ---------- |
| [Line 1] | យេម៉ែនលង់ក្នុងសង្គ្រាម... | Bold |
| [Line 2] | ក្រសួងមហាផ្ទៃឱ្យត្រៀម... | Bold |
| [Line 3] | លទ្ធផលនៃការធ្វើកំណែ... | Moul |


Milestones

  • [x] Basic Font Style Detection
  • [x] Multi-line Document Support
  • [x] Export to DOCX/HTML/Markdown
  • [ ] Add English & Symbol support
  • [x] Add ONNXRuntime for faster inference
  • [x] Add C/C++ Inference Engine

License

Distributed under the MIT License. See the LICENSE file for more information.


Contact

Seanghay Yath


<div align="center"> <a href="https://khmerscan.com/"> <img width="80" src="https://khmerscan.com/favicon.svg" alt="KhmerScan Logo"> </a> <p>Sponsored by <a href="https://khmerscan.com/">KhmerScan</a>

(បម្លែងរូបភាពទៅជាអត្ថបទខ្មែរ, "Convert images to Khmer text")</p>

</div>
