Pdftotext

Simple PDF text extraction

```python
import pdftotext

# Load your PDF
with open("lorem_ipsum.pdf", "rb") as f:
    pdf = pdftotext.PDF(f)

# If it's password-protected
with open("secure.pdf", "rb") as f:
    pdf = pdftotext.PDF(f, "secret")

# How many pages?
print(len(pdf))

# Iterate over all the pages
for page in pdf:
    print(page)

# Read some individual pages
print(pdf[0])
print(pdf[1])

# Read all the text into one string
print("\n\n".join(pdf))
```
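Because the loaded object behaves like a sequence of page strings, ordinary Python helpers can work on it directly. The sketch below is one possible way to find which pages mention a term. The name `find_pages` is hypothetical and not part of pdftotext, and a plain list of strings stands in for a real PDF so the example runs without any file on disk.

```python
def find_pages(pages, term):
    """Return the zero-based indexes of pages whose text contains `term`.

    `pages` can be any iterable of strings; a loaded PDF object is
    assumed to qualify because it can be iterated page by page.
    The match is case-insensitive.
    """
    needle = term.casefold()
    return [i for i, text in enumerate(pages) if needle in text.casefold()]


# Stand-in for a loaded PDF: three "pages" of plain text.
sample_pages = ["Lorem ipsum dolor sit amet", "Consectetur adipiscing", "IPSUM again"]
print(find_pages(sample_pages, "ipsum"))  # [0, 2]
```

With a real document, the `pdf` object from the example above could be passed in place of `sample_pages`. The returned indexes match the ones used in `pdf[0]`, `pdf[1]`, and so on.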

OS Dependencies

These instructions assume you're on a recent OS. Package names may differ for an older OS.

Debian, Ubuntu, and friends

```
sudo apt install build-essential libpoppler-cpp-dev pkg-config python3-dev
```

Fedora, Red Hat, and friends

```
sudo yum install gcc-c++ pkgconfig poppler-cpp-devel python3-devel
```

macOS

```
brew install pkg-config poppler python
```

Windows

Currently tested only when using conda:

- Install the Microsoft Visual C++ Build Tools
- Install poppler through conda:
```
conda install -c conda-forge poppler
```
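
After installing the packages for your platform, it can help to confirm that the Poppler C++ library is visible before building. The sketch below asks pkg-config about it. It assumes the library is registered under the module name `poppler-cpp`, and it is mainly relevant on Linux and macOS, where pkg-config is among the listed packages. The function name `poppler_cpp_visible` is hypothetical.

```python
import shutil
import subprocess


def poppler_cpp_visible():
    """Report whether pkg-config can locate poppler-cpp.

    Returns True or False based on pkg-config's exit status, or None
    when pkg-config itself is not on PATH (so nothing can be checked).
    The module name "poppler-cpp" is an assumption.
    """
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(["pkg-config", "--exists", "poppler-cpp"], check=False)
    return result.returncode == 0


print("poppler-cpp visible to pkg-config:", poppler_cpp_visible())
```

A result of `False` or `None` does not prove the install will fail; it only suggests checking the package names for your OS.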

Install

```
pip install pdftotext
```
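
To confirm the install landed in the interpreter you intend to use, a quick check along these lines may help. It is a minimal sketch using only the standard library. The helper name `pdftotext_installed` is hypothetical.

```python
import importlib.util


def pdftotext_installed():
    """Return True if the current interpreter can locate the pdftotext module."""
    return importlib.util.find_spec("pdftotext") is not None


print("pdftotext available:", pdftotext_installed())
```

If this prints `False` right after a successful `pip install`, the likely cause is that `pip` and `python` point at different environments.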

Source repository: jalan/pdftotext on GitHub
