338 skills found
pymupdf / PyMuPDF: PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
axa-group / Nlp.js: An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more.
tyiannak / PyAudioAnalysis: Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Trusted-AI / Adversarial Robustness Toolbox: Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
libAudioFlux / AudioFlux: A library for audio and music analysis and feature extraction.
jdkato / Prose: A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.
mit-nlp / MITIE: Library and tools for information extraction
NVIDIA / NeMo Retriever: NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts, and images that you can use in downstream generative applications.
landing-ai / Agentic Doc: Legacy Python library for Agentic Document Extraction (ADE). Use the landingai-ade library for all new projects.
mgechev / Injection Js: Dependency injection library for JavaScript and TypeScript in 5.1K. It is extracted from Angular's ReflectiveInjector, which means that it is well designed, feature complete, fast, reliable, and well tested.
Lulzx / Zpdf: Zero-copy PDF text extraction library written in Zig. High-performance, memory-mapped parsing with SIMD acceleration.
xavctn / Img2table: img2table is a table identification and extraction Python library for PDFs and images, based on OpenCV image processing.
mrexodia / Dumpulator: An easy-to-use library for emulating memory dumps. Useful for malware analysis (config extraction, unpacking) and dynamic analysis in general (sandboxing).
ispras / Dedoc: Dedoc is a library (service) for automating document parsing and converting documents to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
landing-ai / Ade Python: Python library for Agentic Document Extraction (ADE).
twitter-archive / Twitter Text Rb: A library that handles auto-linking and extraction of usernames, lists, and hashtags in tweets.
norman / Babosa: A library for creating slugs. Babosa is an extraction and improvement of the string code from FriendlyId, intended to help developers create similar libraries or plugins.
yfedoseev / Pdf Oxide: The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.
anrieff / Libcpuid: A small C library for x86 CPU detection and feature extraction
CogComp / Cogcomp Nlp: CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.