first20hours / Google 10000 English: This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google's Trillion Word Corpus.
tdebatty / Java String Similarity: Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
zhezhaoa / Ngram2vec: Four word embedding models implemented in Python, supporting arbitrary context features.
sinovation / ZEN: A BERT-based Chinese Text Encoder Enhanced by N-gram Representations
rockymadden / Stringmetric: String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Refined Soundex, Soundex, Weighted Levenshtein).
proycon / Pynlpl: PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
adrg / Strutil: Go metrics for calculating string similarity and other string utility functions
FGRibreau / Node Language Detect: 🇫🇷 NodeJS language detection library using n-grams
Pomax / NrGrammar: The Nihongo Resources grammar book: "An Introduction to Japanese; Syntax, Grammar & Language"
ranelpadon / Ngram Type: Touch typing trainer using N-grams as data source, with options to customize the auto-generated lessons and specify the minimum typing performance needed. It also includes sound and color effects.
dalinvip / Cw2vec: cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information
skrtdev / NovaGram: An Object-Oriented PHP library for Telegram Bots
jctian98 / E2e Lfmmi: E2E system with LF-MMI; word N-gram for Mandarin
brymer-meneses / Grammar Guard.nvim: Grammar Guard is a Neovim plugin that checks your grammar as you write your LaTeX, Markdown or plain text document.
mongoid / Mongoid Fulltext: An n-gram-based full-text search implementation for the Mongoid ODM.
zedom1 / Error Detection: Code for a Chinese error detection module, using n-grams and Bi-LSTM
andreekeberg / Ml Classify Text Js: Machine-learning-based text classification in JavaScript using n-grams and cosine similarity
proycon / Colibri Core: Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
jwieting / Charagram: Code to train and use models from "Charagram: Embedding Words and Sentences via Character n-grams".
feedbackmine / Language Detector: Ruby language detection library using n-grams