Results for "extraction-library"

338 skills found · Page 1 of 12

pymupdf / PyMuPDF

9.3k

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

universal

data-science · epub · extract-data · +12

Updated 4h ago

axa-group / Nlp.js

6.6k

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more

universal

bot · bots · chatbot · +10

Updated 2d ago

tyiannak / PyAudioAnalysis

6.2k

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

universal

audio · audio-analysis-tasks · audio-data · +4

Updated 1d ago

Trusted-AI / Adversarial Robustness Toolbox

5.9k

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

universal

adversarial-attacks · adversarial-examples · adversarial-machine-learning · +14

Updated 18h ago

libAudioFlux / AudioFlux

3.3k

A library for audio and music analysis and feature extraction.

universal

audio · audio-analysis · audio-features · +16

Updated 1d ago

jdkato / Prose

3.1k

A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.

universal

natural-language-processing · nlp · prose

Updated 8d ago

mit-nlp / MITIE

3.0k

MITIE: library and tools for information extraction

universal

c-plus-plus · information-extraction · java · +3

Updated 5d ago

NVIDIA / NeMo Retriever

2.9k

NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.

zed

Updated 13h ago

landing-ai / Agentic Doc

2.4k

Legacy Python library for Agentic Document Extraction (ADE). Use the landingai-ade library for all new projects.

universal

Updated 2h ago

mgechev / Injection Js

1.4k

Dependency injection library for JavaScript and TypeScript in 5.1K. It is an extraction of Angular's ReflectiveInjector, which means that it is well designed, feature complete, fast, reliable, and well tested.

universal

decorators · dependency-injection · javascript · +2

Updated 18d ago

Lulzx / Zpdf

891

Zero-copy PDF text extraction library written in Zig. High-performance, memory-mapped parsing with SIMD acceleration.

universal

high-performance · parser · pdf · +5

Updated 23h ago

xavctn / Img2table

860

img2table is a table identification and extraction Python library for PDFs and images, based on OpenCV image processing

universal

image-processing · opencv · python · +1

Updated 11d ago

mrexodia / Dumpulator

859

An easy-to-use library for emulating memory dumps. Useful for malware analysis (config extraction, unpacking) and dynamic analysis in general (sandboxing).

universal

cross-platform · debugging-tools · easy-to-use · +16

Updated 4h ago

ispras / Dedoc

654

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

universal

doc · document-analysis · document-content-extraction · +15

Updated 4h ago

landing-ai / Ade Python

629

Python library for Agentic Document Extraction (ADE).

universal

Updated 1d ago

twitter-archive / Twitter Text Rb

607

A library that does auto linking and extraction of usernames, lists and hashtags in tweets

universal

Updated 1mo ago

norman / Babosa

536

A library for creating slugs. Babosa is an extraction and improvement of the string code from FriendlyId, intended to help developers create similar libraries or plugins.

universal

Updated 1mo ago

yfedoseev / Pdf Oxide

520

The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.

universal

data-extraction · document-processing · fast · +15

Updated 2h ago

anrieff / Libcpuid

508

A small C library for x86 CPU detection and feature extraction

universal

Updated 29d ago

CogComp / Cogcomp Nlp

479

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.

universal

big-data · cogcomp · data-mining · +15

Updated 22d ago