OpenOCR
OpenOCR is an open-source toolkit for General-OCR research and applications. It integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.
<a href="https://github.com/Topdu/OpenOCR/blob/main/LICENSE"><img alt="license" src="https://img.shields.io/github/license/Topdu/OpenOCR"></a> <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-pink.svg"></a> <a href="https://github.com/Topdu/OpenOCR/graphs/contributors"><img src="https://img.shields.io/github/contributors/Topdu/OpenOCR?color=9ea"></a> <a href="https://pepy.tech/project/openocr"><img src="https://static.pepy.tech/personalized-badge/openocr?period=total&units=abbreviation&left_color=grey&right_color=blue&left_text=Clone%20downloads"></a> <a href="https://github.com/Topdu/OpenOCR/stargazers"><img src="https://img.shields.io/github/stars/Topdu/OpenOCR?color=ccf"></a> <a href="https://pypi.org/project/openocr-python/"><img alt="PyPI" src="https://img.shields.io/pypi/v/openocr-python"></a> <a href="https://pypi.org/project/openocr-python/"><img src="https://img.shields.io/pypi/dm/openocr-python?label=PyPI%20downloads"></a>
English | 简体中文
</div>

OpenOCR is an open-source toolkit developed by the OCR team from FVL Lab, Fudan University, under the guidance of Prof. Yu-Gang Jiang and Prof. Zhineng Chen. It focuses on 「General-OCR」 tasks, including Text Detection and Recognition, Formula and Table Recognition, as well as Document Parsing and Understanding. The toolkit integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.
OpenOCR aims to build a comprehensive open-source ecosystem for General-OCR, bridging academic research and real-world applications, and fostering the collaborative development and widespread deployment of OCR technologies across both research frontiers and industrial scenarios. We welcome researchers, developers, and industry partners to explore the toolkit and share feedback.
🚀 Quick Start
Features
- 🔥OpenDoc-0.1B: Ultra-Lightweight Document Parsing System with 0.1B Parameters
  - ⚡[Quick Start] [Local Demo]
  - An ultra-lightweight document parsing system with only 0.1B parameters.
  - Two-stage pipeline:
    - Layout analysis via PP-DocLayoutV2.
    - Unified recognition of text, formulas, and tables using the in-house model UniRec-0.1B.
      - In the original version of UniRec-0.1B, only text and formula recognition were supported. In OpenDoc-0.1B, we rebuilt UniRec-0.1B to enable unified recognition of text, formulas, and tables.
  - Supports document parsing for Chinese and English.
  - Achieves 90.57% on OmniDocBench (v1.5), outperforming many document parsing models based on multimodal large language models.
- 🔥UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters
  - [Doc]
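
The two-stage OpenDoc-0.1B pipeline described above (layout analysis first, then a single recognizer for text, formulas, and tables) can be pictured with a minimal sketch. This is not the OpenOCR API: `Region`, `parse_page`, `fake_layout`, and `fake_recognize` are hypothetical names introduced only for illustration, and the reading-order field and the Markdown-style output are assumptions rather than documented behavior. The real layout stage (PP-DocLayoutV2) and recognizer (UniRec-0.1B) are stood in for by stub functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixel coordinates
Box = Tuple[int, int, int, int]


@dataclass
class Region:
    """Hypothetical output of the layout stage for one page element."""
    box: Box
    kind: str   # "text", "formula", or "table"
    order: int  # assumed reading-order index supplied by the layout stage


def parse_page(
    page: object,
    detect_layout: Callable[[object], List[Region]],
    recognize: Callable[[object, Region], str],
) -> str:
    """Stage 1: find regions. Stage 2: run one recognizer on every region."""
    regions = sorted(detect_layout(page), key=lambda r: r.order)
    parts = [recognize(page, region) for region in regions]
    # Join non-empty results in reading order, separated by blank lines.
    return "\n\n".join(p for p in parts if p)


# Stub stand-ins for the two models (assumptions, not real model calls).
def fake_layout(page: object) -> List[Region]:
    return [
        Region((0, 40, 100, 60), "formula", 1),
        Region((0, 0, 100, 30), "text", 0),
        Region((0, 70, 100, 120), "table", 2),
    ]


def fake_recognize(page: object, region: Region) -> str:
    canned = {"text": "Title", "formula": "$E = mc^2$", "table": "| a | b |"}
    return canned[region.kind]


markdown = parse_page(None, fake_layout, fake_recognize)
print(markdown)
```

The point of the sketch is the design choice the feature list highlights: because one recognizer handles all three region types, the second stage is a single call per region rather than a dispatch to separate text, formula, and table models.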
