PaddlePaddle / PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
run-llama / Llama Index
LlamaIndex is the leading document agent and OCR platform
getomni-ai / Zerox
OCR & Document Extraction using vision models
clovaai / Donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
mindee / Doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
run-llama / Liteparse
A fast, helpful, and open-source document parser
ocropus-archive / DUP Ocropy
Python-based tools for document analysis and OCR
CatchTheTornado / Text Extract Api
Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
Nutlope / Llama Ocr
Document to Markdown OCR library with Llama 3.2 vision
WZBSocialScienceCenter / Pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
icereed / Paperless Gpt
Use LLMs and LLM Vision (OCR) to handle paperless-ngx - Document Digitalization powered by AI
tjmlabs / ColiVara
ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has state-of-the-art retrieval performance on both text and visual documents, using vision models instead of chunking and text processing. No OCR, no text extraction, no broken tables, or missing images.
NanoNets / Docstrange
Extract and convert data from any document, images, pdfs, word doc, ppt or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.
Topdu / OpenOCR
OpenOCR: An Open-Source Toolkit for General-OCR Research and Applications, integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful reproductions of the core implementations from a wide range of academic papers.
opensemanticsearch / Open Semantic Search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
scribeocr / Scribeocr
Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.
drmingler / Docling Api
Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.
kreuzberg-dev / Html To Markdown
High performance and CommonMark compliant HTML to Markdown converter. Maintained by the Kreuzberg team. Kreuzberg is a fast, polyglot document intelligence engine with a Rust core. It extracts structured data from 56+ document formats using streaming parsers and built-in OCR.
fufankeji / DeepSeek OCR Web
Out-of-the-box DeepSeek OCR document parsing Web Studio
opendatalab / MinerU Diffusion
A diffusion-based framework for document OCR that replaces autoregressive decoding with block-level parallel diffusion decoding.