Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

<p align='center'> <img src="assets/logo.png" width="275"><br /> </p> <h2 align="center"><i>Turn your documents into data!</i></h2> <p align="center"> <a href="https://cloud.drone.io/axa-group/Parsr"><img src="https://cloud.drone.io/api/badges/axa-group/Parsr/status.svg"></a> </p> <p align="center"> <a href="README_fr.md">Français</a> | <a href="README_pt.md">Portuguese</a> | <a href="README_sp.md">Spanish</a> | <a href="README_zh-cn.md">中文</a> </p>

[!WARNING] This project is no longer maintained. Security patches are not being applied. Consider using an alternative such as LiteParse if you need a local, Apache 2.0–licensed parsing solution.

Parsr is a minimal-footprint toolchain for cleaning, parsing and extracting content from documents (image, PDF, DOCX, EML). It generates organized, ready-to-use data in JSON, Markdown (MD), CSV/Pandas DF or TXT formats.
It provides analysts, data scientists and developers with clean, structured and label-enriched information for applications such as data entry, document analysis automation and archival.
Currently, Parsr can perform document cleaning, hierarchy regeneration (words, lines, paragraphs), and detection of headings, tables, lists, tables of contents, page numbers, headers/footers, links, and more. Check out all the features.
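
To illustrate what a regenerated hierarchy (words, lines, paragraphs) might look like when consumed downstream, here is a minimal sketch. The document tree below is hypothetical and invented for illustration only; the key names (`pages`, `elements`, `type`, `content`) are assumptions and may not match the actual Parsr JSON output, so check the documentation for the real schema.

```python
from typing import Any, Dict, List

# Hypothetical document tree, invented for illustration only.
# The key names are assumptions, not the documented Parsr schema.
sample: Dict[str, Any] = {
    "pages": [
        {
            "elements": [
                {"type": "heading", "content": [
                    {"type": "line", "content": [
                        {"type": "word", "content": "Annual"},
                        {"type": "word", "content": "Report"},
                    ]},
                ]},
                {"type": "paragraph", "content": [
                    {"type": "line", "content": [
                        {"type": "word", "content": "Revenue"},
                        {"type": "word", "content": "grew."},
                    ]},
                ]},
            ]
        }
    ]
}


def element_text(element: Dict[str, Any]) -> str:
    """Flatten an element into text by recursing through its children."""
    content = element.get("content", "")
    if isinstance(content, str):
        return content  # leaf node: a single word
    return " ".join(element_text(child) for child in content)


def texts_of_type(document: Dict[str, Any], wanted: str) -> List[str]:
    """Collect the text of every top-level page element of the given type."""
    return [
        element_text(element)
        for page in document.get("pages", [])
        for element in page.get("elements", [])
        if element.get("type") == wanted
    ]
```

With the sample above, `texts_of_type(sample, "heading")` yields `["Annual Report"]`. The recursive walk is used so that the same function works whether an element holds words directly or nests them inside lines.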

Table of Contents
Getting Started
- Installation
- Usage
Documentation
Contribute
Third Party Licenses
License

Getting Started

Installation

-- The advanced installation guide is available here --

The quickest way to install and run the Parsr API is through the docker image:

docker pull axarev/parsr

If you also wish to install the GUI for sending documents and visualising results:

docker pull axarev/parsr-ui-localhost

Note: Parsr can also be installed bare-metal (without Docker containers); the procedure is documented in the installation guide.

Usage

-- The advanced usage guide is available here --

To run the API, issue:

docker run -p 3001:3001 axarev/parsr

This launches the API on http://localhost:3001.
Consult the API documentation for usage details.
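
As a rough idea of how a script could talk to the running API, the sketch below uploads a file over HTTP using only the Python standard library. The `/api/v1` prefix, the `document` and `json/<id>` routes, the `file` form field and the assumption that the reply to the upload is a job ID are all assumptions here, not confirmed by this README; the API documentation is the authority on the real routes and required fields (for example, whether a configuration file must accompany the upload).

```python
import json
import urllib.request
import uuid
from typing import Tuple

BASE_URL = "http://localhost:3001"  # address from the docker run command above
API_PREFIX = "/api/v1"              # assumption: versioned path prefix


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL, the assumed API prefix and a resource path."""
    return f"{base_url.rstrip('/')}{API_PREFIX}/{path.lstrip('/')}"


def encode_multipart(filename: str, payload: bytes,
                     field: str = "file") -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def submit_document(path: str) -> str:
    """POST a document; the reply is assumed to be a job identifier."""
    with open(path, "rb") as handle:
        body, content_type = encode_multipart(path.split("/")[-1], handle.read())
    request = urllib.request.Request(
        endpoint("document"),  # assumed route
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()


def fetch_json(job_id: str) -> dict:
    """GET the JSON result for a job (assumed route: json/<job_id>)."""
    with urllib.request.urlopen(endpoint(f"json/{job_id}")) as response:
        return json.load(response)
```

A typical call sequence would be `job_id = submit_document("sample.pdf")` followed by `fetch_json(job_id)`, where `sample.pdf` is a placeholder file name. If processing is asynchronous, the result may not be available immediately; this sketch does not poll or retry.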

To install the Python client for the Parsr API, run:
```
pip install parsr-client
```
To try the Jupyter Notebook that uses the Python client, head over to the Jupyter demo.

To use the GUI tool (the API needs to already be running), issue:
```
docker run -t -p 8080:80 axarev/parsr-ui-localhost:latest
```
Then, access it through http://localhost:8080.
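
Since the GUI expects the API to be running already, a quick reachability check can save some debugging. The following is a minimal sketch that only tests whether something accepts TCP connections on the ports used in the commands above (3001 for the API, 8080 for the GUI); it does not verify that the listener is actually Parsr. The function names are illustrative and not part of Parsr.

```python
import socket
from typing import Dict


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def describe_services(host: str = "localhost") -> Dict[str, bool]:
    """Report which of the two local services accept connections."""
    # Ports taken from the docker run commands in this README.
    ports = {"api": 3001, "gui": 8080}
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Calling `describe_services()` returns a dictionary such as `{"api": True, "gui": False}`, depending on which containers are up.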

Refer to the Configuration documentation to interpret the configurable options in the GUI viewer.

API-based usage and command-line usage are documented in the advanced usage guide.

Documentation

All documentation files can be found here.

Contribute

Please refer to the contribution guidelines.

Third Party Licenses

Parsr depends on the following third-party libraries, listed with their licenses:

QPDF: Apache http://qpdf.sourceforge.net
ImageMagick: Apache 2.0 https://imagemagick.org/script/license.php
Pdfminer.six: MIT https://github.com/pdfminer/pdfminer.six/blob/master/LICENSE
PDF.js: Apache 2.0 https://github.com/mozilla/pdf.js
Tesseract: Apache 2.0 https://github.com/tesseract-ocr/tesseract
Camelot: MIT https://github.com/camelot-dev/camelot
MuPDF (Optional dependency): AGPL https://mupdf.com/license.html
Pandoc (Optional dependency): GPL https://github.com/jgm/pandoc

License
