Aiopytesseract
A Python asyncio wrapper for Tesseract-OCR.
Installation
Install and update using pip:
pip install aiopytesseract
Usage
List all languages available in the Tesseract installation
import aiopytesseract
await aiopytesseract.languages()
await aiopytesseract.get_languages()
Tesseract version
import aiopytesseract
await aiopytesseract.tesseract_version()
await aiopytesseract.get_tesseract_version()
Tesseract parameters
import aiopytesseract
await aiopytesseract.tesseract_parameters()
Confidence only info
import aiopytesseract
await aiopytesseract.confidence("tests/samples/file-sample_150kB.png")
Deskew info
import aiopytesseract
await aiopytesseract.deskew("tests/samples/file-sample_150kB.png")
Extract text from an image: from a local path or from bytes
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_string("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_string(
    Path("tests/samples/file-sample_150kB.png").read_bytes(), dpi=220, lang='eng+por'
)
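Because every call above is a coroutine, several images can be processed concurrently. The following is a minimal sketch, not part of aiopytesseract: `ocr_many` and `fake_ocr` are hypothetical names, and `fake_ocr` is a stand-in so the example runs without Tesseract installed. In real use you would presumably pass `aiopytesseract.image_to_string` instead. It also shows how to drive the `await` calls from a regular script with `asyncio.run`.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def ocr_many(
    paths: Iterable[str],
    ocr: Callable[[str], Awaitable[str]],
    limit: int = 4,
) -> list[str]:
    """Run `ocr` on every path with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(path: str) -> str:
        # The semaphore caps how many OCR calls run at the same time.
        async with semaphore:
            return await ocr(path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(worker(p) for p in paths))


async def fake_ocr(path: str) -> str:
    """Hypothetical stand-in for aiopytesseract.image_to_string."""
    await asyncio.sleep(0)
    return f"text from {path}"


results = asyncio.run(ocr_many(["a.png", "b.png"], fake_ocr, limit=2))
print(results)
```

The limit matters because each OCR call is typically CPU-heavy; an unbounded `gather` over hundreds of images could overload the machine.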
Box estimates
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_boxes("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_boxes(Path("tests/samples/file-sample_150kB.png").read_bytes())
Boxes, confidence and page numbers
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_data("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_data(Path("tests/samples/file-sample_150kB.png").read_bytes())
Information about orientation and script detection
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_osd("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_osd(Path("tests/samples/file-sample_150kB.png").read_bytes())
Generate a searchable PDF
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_pdf("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_pdf(Path("tests/samples/file-sample_150kB.png").read_bytes())
Generate HOCR output
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_hocr("tests/samples/file-sample_150kB.png")
await aiopytesseract.image_to_hocr(Path("tests/samples/file-sample_150kB.png").read_bytes())
Multi output
from pathlib import Path
import aiopytesseract
async with aiopytesseract.run(
    Path('tests/samples/file-sample_150kB.png').read_bytes(),
    'output',
    'alto tsv txt'
) as resp:
    # will generate (output.xml, output.tsv and output.txt)
    print(resp)
    alto_file, tsv_file, txt_file = resp
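The comment in the snippet above implies a mapping from each requested output format to a file extension. As a rough sketch, the helper below predicts the file names for a given base name and format string. `expected_outputs` and `EXTENSIONS` are hypothetical names, not part of aiopytesseract, and the helper says nothing about which directory the files are written to.

```python
from pathlib import Path

# Extensions the Tesseract CLI uses for these output config names.
EXTENSIONS = {
    "alto": ".xml",
    "tsv": ".tsv",
    "txt": ".txt",
    "hocr": ".hocr",
    "pdf": ".pdf",
}


def expected_outputs(base: str, formats: str) -> list[Path]:
    """Return the file names expected for a space-separated format string."""
    return [Path(base + EXTENSIONS[name]) for name in formats.split()]


for path in expected_outputs("output", "alto tsv txt"):
    print(path.name)
```

A format missing from `EXTENSIONS` raises `KeyError`, which is deliberate here: the table only lists formats whose extension is known.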
Config variables
from pathlib import Path
import aiopytesseract
async with aiopytesseract.run(
    Path('tests/samples/text-with-chars-and-numbers.png').read_bytes(),
    'output',
    'alto tsv txt',
    config=[("tessedit_char_whitelist", "0123456789")]
) as resp:
    # will generate (output.xml, output.tsv and output.txt)
    print(resp)
    alto_file, tsv_file, txt_file = resp
from pathlib import Path
import aiopytesseract
await aiopytesseract.image_to_string(
    "tests/samples/text-with-chars-and-numbers.png",
    config=[("tessedit_char_whitelist", "0123456789")]
)
await aiopytesseract.image_to_string(
    Path("tests/samples/text-with-chars-and-numbers.png").read_bytes(),
    dpi=220,
    lang='eng+por',
    config=[("tessedit_char_whitelist", "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")]
)
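The whitelist strings above can be built from the standard library's `string` constants rather than typed by hand, which avoids missing a character. This is a small sketch; the `whitelist` helper is a hypothetical convenience, not part of aiopytesseract, and only produces the list-of-tuples shape that the `config` argument takes in the examples above.

```python
import string


def whitelist(chars: str) -> list[tuple[str, str]]:
    """Build a config list restricting recognition to `chars`."""
    return [("tessedit_char_whitelist", chars)]


digits_only = whitelist(string.digits)          # same value as "0123456789"
letters_only = whitelist(string.ascii_letters)  # a-z followed by A-Z

print(digits_only)
print(letters_only)
```

Either list could then be passed as `config=digits_only` or `config=letters_only` in the calls shown earlier.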
For more details on Tesseract best practices and on aiopytesseract, see the docs folder.
Examples
To try aiopytesseract quickly, you can use one of these options:
- docker/docker-compose
- streamlit
Docker / docker-compose
After cloning this repo, run the command below:
docker-compose up -d
streamlit app
This option requires installing aiopytesseract and streamlit first; then run:
# remote option:
streamlit run https://github.com/amenezes/aiopytesseract/blob/master/examples/streamlit/app.py
# local option:
streamlit run examples/streamlit/app.py
Note: the streamlit example needs Python >= 3.13
Links
- License: Apache License
- Code: https://github.com/amenezes/aiopytesseract
- Issue tracker: https://github.com/amenezes/aiopytesseract/issues
- Docs: https://github.com/amenezes/aiopytesseract
