PDFtoTXT
Python code to read text from a PDF file (OCR).
PDF to TXT
Python code that runs OCR on a PDF file and exports the recognized text to a TXT file.
- LocalOCR: based on Tesseract OCR
- CloudOCR: based on Google Vision API
Setup for LocalOCR on Ubuntu
apt-get install python-pyocr python-wand imagemagick
apt-get install libleptonica-dev tesseract-ocr-dev
apt-get install tesseract-ocr-ita
pip install -r requirements.txt
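With the packages above installed, the LocalOCR flow can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`txt_path_for`, `join_pages`, `ocr_pdf_locally`, `pdf_to_txt`) are hypothetical, the 300 DPI resolution is an assumed default, and `lang="ita"` simply matches the `tesseract-ocr-ita` language pack installed above. It assumes Wand (with ImageMagick) renders the PDF pages and pyocr drives Tesseract.

```python
import os


def txt_path_for(pdf_path):
    """Return the output path: same name as the PDF, with a .txt extension."""
    base, _ext = os.path.splitext(pdf_path)
    return base + ".txt"


def join_pages(page_texts):
    """Join per-page text, separating pages with a form feed character."""
    return "\f".join(text.strip() for text in page_texts) + "\n"


def ocr_pdf_locally(pdf_path, lang="ita", resolution=300):
    """OCR every page of pdf_path with Tesseract; return a list of page texts."""
    # Imported lazily so the helpers above work without the OCR stack installed.
    import io

    import pyocr
    import pyocr.builders
    from PIL import Image as PILImage
    from wand.image import Image as WandImage

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found; is tesseract-ocr installed?")
    tool = tools[0]

    page_texts = []
    # Wand rasterizes the PDF through ImageMagick, one frame per page.
    with WandImage(filename=pdf_path, resolution=resolution) as pdf:
        for page in pdf.sequence:
            with WandImage(image=page) as page_image:
                png_bytes = page_image.make_blob("png")
            text = tool.image_to_string(
                PILImage.open(io.BytesIO(png_bytes)),
                lang=lang,
                builder=pyocr.builders.TextBuilder(),
            )
            page_texts.append(text)
    return page_texts


def pdf_to_txt(pdf_path):
    """OCR pdf_path and write the text next to it as a .txt file."""
    out_path = txt_path_for(pdf_path)
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(join_pages(ocr_pdf_locally(pdf_path)))
    return out_path
```

Calling `pdf_to_txt("scan.pdf")` would write `scan.txt` in the same directory. For documents in another language, install the matching Tesseract language pack and pass its code as `lang`.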
Setup for CloudOCR on Ubuntu
Install the Google Cloud SDK, then run:
apt-get install poppler-utils google-cloud-sdk-app-engine-python  # poppler-utils provides pdfimages
pip install -r requirements.txt
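The CloudOCR flow might look like the sketch below: extract the images embedded in the PDF with `pdfimages`, then send each one to the Google Vision API. This is an illustration under assumptions, not the repository's actual code. The function names are hypothetical, and it assumes the `google-cloud-vision` Python client (version 2 or later) is among the installed requirements and that credentials are available through `GOOGLE_APPLICATION_CREDENTIALS`.

```python
import glob
import os
import subprocess


def pdfimages_command(pdf_path, prefix):
    """Build the pdfimages call that dumps embedded images as PNG files."""
    return ["pdfimages", "-png", pdf_path, prefix]


def extract_page_images(pdf_path, out_dir):
    """Run pdfimages and return the extracted PNG paths in page order."""
    prefix = os.path.join(out_dir, "page")
    subprocess.run(pdfimages_command(pdf_path, prefix), check=True)
    # pdfimages names its output <prefix>-000.png, <prefix>-001.png, ...
    return sorted(glob.glob(prefix + "-*.png"))


def ocr_image_in_cloud(image_path):
    """Send one image to the Vision API and return the detected text."""
    # Imported lazily so the helpers above work without the client installed.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as handle:
        image = vision.Image(content=handle.read())
    response = client.document_text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


def pdf_to_txt_cloud(pdf_path, out_dir, txt_path):
    """OCR every extracted image and write the combined text to txt_path."""
    texts = [ocr_image_in_cloud(p) for p in extract_page_images(pdf_path, out_dir)]
    with open(txt_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(texts))
    return txt_path
```

Note that `pdfimages` extracts the images embedded in the PDF rather than rendering pages, which suits scanned documents where each page is typically a single image.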
