jsvine / Pdfplumber: Plumb a PDF for detailed information about each char, rectangle, line, et cetera, and easily extract text and tables.
py-pdf / Pypdf: A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
QuivrHQ / MegaParse: File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
opendataloader-project / Opendataloader Pdf: PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
euske / Pdfminer: Python PDF Parser (Not actively maintained). Check out pdfminer.six.
CosmosShadow / Gptpdf: Using GPT to parse PDF
CatchTheTornado / Text Extract Api: Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
vsch / Flexmark Java: CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
chatdoc-com / OCRFlux: OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex layout handling, complicated table parsing and cross-page content merging.
yob / Pdf Reader: The PDF::Reader library implements a PDF parser conforming as much as possible to the PDF specification from Adobe.
LibPDF-js / Core: A modern PDF library for TypeScript. Parse, modify, and generate PDFs with a clean, intuitive API.
wisupai / E2m: E2M converts various file types (doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, m4a) into Markdown. It’s easy to install, with dedicated parsers and converters, supporting custom configs. E2M offers an all-in-one, flexible, and open-source solution.
galkahana / HummusJS: Node.js module for high performance creation, modification and parsing of PDF files and streams
galkahana / PDF Writer: High performance library for creating, modifying and parsing PDF files in C++
adithya-s-k / Marker Api: Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
twwch / JadeAI: AI-Powered Smart Resume Builder with 50+ professional templates, PDF/image parsing, AI optimization, JD match analysis, multi-format export. Open source & free, one-click Docker deployment.
Lulzx / Zpdf: Zero-copy PDF text extraction library written in Zig. High-performance, memory-mapped parsing with SIMD acceleration.
Skythinker616 / Gpt Assistant Android: [New: PDF and Office file parsing and upload] GPT assistant for Android, activated via the volume keys for voice interaction, with support for web access, taking photos, templates, and parsing PDF and Office documents.
drmingler / Docling Api: Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc) into Markdown. With support for both CPU and GPU processing, it is ideal for large-scale workflows and offers text/table extraction, OCR, and batch processing with sync/async endpoints.
flyhunterl / Flymd: High-performance Markdown note tool! Free AI, smart notes, TODO reminders, local knowledge base, AI novel engine. PDF parsing, auto voice notes, audio-to-text. Millisecond startup.