21 skills found
adobe / NLP Cube: Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
erre-quadro / Spikex: SpikeX - SpaCy Pipes for Knowledge Extraction
vngrs-ai / Vnlp: State-of-the-art, lightweight NLP tools for Turkish language. Developed by VNGRS.
mediacloud / Sentence Splitter: Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder.
vanderlee / Php Sentence: Simple text sentence splitting and counting. Supports at least English, German and Dutch, possibly more. If you find it works well enough for your language, please let me know!
hotchpotch / Fast Bunkai: ⚡ Japanese sentence splitting (日本語文境界判定器), 40–250× faster via a Rust-accelerated Python library with near-perfect API compatibility with megagonlabs/bunkai.
sentencizer / Sentencizer: A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.
Prismadic / Magnet: the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly
zaemyung / Sentsplit: A flexible sentence segmentation library using a CRF model and regex rules
turian / Pytextpreprocess: Preprocess text for NLP (tokenizing, lowercasing, stemming, sentence splitting, etc.)
clipperhouse / Uax29.net: A tokenizer for splitting words, graphemes and sentences per Unicode UAX #29, for .Net
maxazure / Video Editing Skill: OpenClaw Skill: Auto video editing for talk/vlog videos — speech recognition, sentence splitting, subtitle burning, and clip merging
petermr / Docanalysis: Semantic analysis of text documents including sentence and paragraph splitting
Lambda-3 / MinWikiSplit: Sentence Splitting Dataset
rrtucci / SentenceAx: Complete rewrite of the sentence splitting program Openie6
yojkim / YKHangul: Splits Korean sentences into their component characters, such as 'Initial', 'Medial', and 'Final'.
crowegian / Medical Sentence Tokenizer: Some of my work on splitting medical text into sentences for BERT language model training.
mfaruqui / Multilingual Sentence Splitter: Multilingual Sentence Splitting Tool
KorAP / Datok: High-Performance Finite State Tokenizer
eliorsulem / HSplit Corpus: Gold-Standard Sentence Splitting Corpus