Grobidmonkey
The grobidmonkey package is an open-source package designed for postprocessing GROBID outputs.
- Website: https://github.com/com3dian/Grobidmonkey
- Documentation: https://github.com/com3dian/Grobidmonkey/tree/master/Document
- Source code: https://github.com/com3dian/Grobidmonkey/tree/master/src/grobidmonkey
- Bug reports: https://github.com/com3dian/Grobidmonkey/issues
- Citing in your work: https://studenttheses.uu.nl/handle/20.500.12932/45939 or
@mastersthesis{lu2024unsupervised,
title={Unsupervised Paper2Slides Generation},
author={Lu, Zehao},
year={2024}
}
grobidmonkey is a lightweight Python package built to handle TEI XML files generated by GROBID. It provides a reader class that converts these files into Python dictionaries, making them simple to read and work with. The reader can load an entire essay as a dictionary in which each key is a section title and each value is a list of that section's paragraphs. It also provides a method for reading the essay's outline as a tree.
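To make the essay structure concrete, here is a minimal sketch of the dictionary shape described above. The sample data and the helper name `summarize_essay` are hypothetical and not part of grobidmonkey; the reader's actual output may differ in detail.

```python
# Hypothetical sample in the shape described above:
# section title -> list of paragraph strings.
essay = {
    "Introduction": ["First paragraph.", "Second paragraph."],
    "Method": ["Only paragraph of this section."],
}


def summarize_essay(essay):
    """Return (title, paragraph count, word count) for each section."""
    summary = []
    for title, paragraphs in essay.items():
        # Count whitespace-separated words across all paragraphs of the section.
        words = sum(len(p.split()) for p in paragraphs)
        summary.append((title, len(paragraphs), words))
    return summary


for title, n_paragraphs, n_words in summarize_essay(essay):
    print(f"{title}: {n_paragraphs} paragraph(s), {n_words} word(s)")
```

Because the structure is a plain dictionary of lists, ordinary Python tools (iteration, comprehensions, `json.dumps`) should work on it without any extra wrapper.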
Installation
grobidmonkey is currently available only on PyPI and can be installed with
pip install grobidmonkey
Quick Start
from grobidmonkey import reader
monkeyReader = reader.MonkeyReader('monkey') # or 'lxml' or 'x2d'
# read paper outline
outline = monkeyReader.readOutline('/path/to/your/paper.pdf.tei.xml')
# read paper content
essay = monkeyReader.readEssay('/path/to/your/paper.pdf.tei.xml')
For a detailed explanation and tutorial, please check the Document page.
Contribution
We welcome all contributions, whether they involve code, documentation, or testing. Feel free to reach out to me via email at com3dian@outlook.com.
Icon
Grobidmonkey's icon is a walking monkey.
$$
$$$$$$$$$$$$$$$$$$ $$$$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$
$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$ $$$$$$$$$$$$
$$$$$$
$$$$$$
$$$$$$
$$$$$$ GROBIDMONKEY
$$$$$$$ $$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$
$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$ $$$$$$$$ $$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$ $$ $
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$ $$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$
$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$ $$$$$$$
$$$$$$$$$$$$$$$ $$$$$$$$$ $$$$$$$$$$$$$$$$$
$$$$$ $$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$ $$$$
$$$$$$$$$$$ $$$$$$$$$$$$ $$$$$$$$$$$$$$$$ $$$$$$$
$$$$$$$$$$$$$$$$ $$$$$$$$$$$$ $$$$$$$$$$$$$$ $$$$$$$$$$
$$$$$$$$$$$$$$$$$ $$$$$$$$$$$ $$$$$$$$$$$$$ $$$$$$$$$$$
$$$$$$$$$$$ $$$$$$$$$$ $$$$$$$$$$ $$$$$$$$$$$
$$$$$$$ $$$$$$$$$ $$$$$$$$ $$$$$$$$$$
$$$$$ $$$$$$$$$ $$$$$$$ $$$$$$$$$
$$$$$$ $$$$$$$$ $$$$$$$$ $$$$$$$$$
$$$$$$ $$$$$$$$ $$$$$$$$ $$$$$$$$$
$$$$$$ $$$$$$$$ $$$$$$$$$ $$$$$$$$
$$$$$ $$$$$$$$$ $$$$$$$$$ $$$$$$$$
$$$$$$$$$$ $$$$$$$$$ $$$$$$$$
$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$
$$$$$$$$$$$ $$$$$$$$$$$$$$
About GROBID
GROBID means GeneRation Of BIbliographic Data.
GROBID is a machine learning library for extracting, parsing and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications.
You can also try the GROBID web app with your paper.
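For readers unfamiliar with the TEI XML that GROBID produces, the following is a rough sketch of pulling section titles and paragraphs out of such a file using only the Python standard library. It is not grobidmonkey's implementation: the sample document is a hand-written, heavily simplified fragment, and the function name `sections_from_tei` is hypothetical. Real GROBID output contains many more elements (header metadata, references, figures) that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# TEI documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written, simplified fragment for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div><head>Introduction</head><p>First paragraph.</p><p>Second paragraph.</p></div>
      <div><head>Method</head><p>Only paragraph.</p></div>
    </body>
  </text>
</TEI>"""


def sections_from_tei(xml_text):
    """Map each top-level body section title to its list of paragraph texts."""
    root = ET.fromstring(xml_text)
    sections = {}
    for div in root.findall(".//tei:body/tei:div", TEI_NS):
        head = div.find("tei:head", TEI_NS)
        # Sections without a <head> are grouped under an empty title.
        title = "".join(head.itertext()).strip() if head is not None else ""
        paragraphs = [
            "".join(p.itertext()).strip() for p in div.findall("tei:p", TEI_NS)
        ]
        sections.setdefault(title, []).extend(paragraphs)
    return sections


print(sections_from_tei(SAMPLE_TEI))
```

Using `itertext()` rather than `.text` keeps the text of inline child elements (such as citation references) inside a paragraph instead of truncating at the first nested tag.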
