Kadot

Natural language processing using unsupervised vectors representation.

<p align="center"> <img src="https://github.com/the-new-sky/Kadot/raw/1.0dev/logo.png" alt="Kadot" height="100px"/> </p>


⚠️ Kadot is no longer in development. The project had two branches: 0.x and 1.x (this one).

Kadot is a high-level open-source library to easily process text documents. It relies on vector representations of documents or words in order to solve NLP tasks such as summarization, spellchecking or classification.

# How to get n-grams using Kadot.
>>> from kadot.tokenizers import regex_tokenizer
>>> hello_tokens = regex_tokenizer("Kadot just lets you process a text easily.")
>>> hello_tokens.ngrams(n=2)
[('Kadot', 'just'), ('just', 'lets'), ('lets', 'you'), ('you', 'process'), ('process', 'a'), ('a', 'text'), ('text', 'easily')]
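
For readers who do not have Kadot installed, the same bigrams can be reproduced in plain Python. This is only a minimal sketch, not Kadot's implementation: `simple_tokenize` and `ngrams` are hypothetical helpers, and the `\w+` pattern is an assumption about how words are split.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens (hypothetical stand-in, not Kadot's tokenizer)."""
    return re.findall(r"\w+", text)


def ngrams(tokens, n=2):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = simple_tokenize("Kadot just lets you process a text easily.")
bigrams = ngrams(tokens, n=2)
print(bigrams)
```

Sliding a window of size `n` over the token list yields `len(tokens) - n + 1` n-grams, which is why the eight-word sentence produces seven bigrams.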

What's 🆕 in 1.0?

⚠️ Some of these new features may not yet be available on GitHub.

  • Vectorizers: We now offer the Word2Vec, state-of-the-art fastText, and Doc2Vec algorithms, using Gensim's powerful backend.
  • Performance: Thanks to a much more efficient algorithm, the new word vectorizer is up to 95% faster, and sparse vectors now take up to 94% less memory.
  • Models: Kadot now includes a text classifier, an automatic text summarizer, and an entity labeler, which can be useful in many projects.
  • Bot Engine: Coming soon.
  • Dependencies 😞: To guarantee good performance without reinventing the wheel, we are adding Gensim and PyTorch to our list of dependencies. Although installed by default, these libraries will be optional; only NumPy and SciPy are strictly required to use Kadot.
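
Why sparse vectors save memory can be illustrated in plain Python. The sketch below is a generic illustration under stated assumptions, not Kadot's code: `dense_vector` and `sparse_vector` are hypothetical helpers, and the toy vocabulary is invented for the example.

```python
from collections import Counter


def dense_vector(tokens, vocabulary):
    """One slot per vocabulary word, including all the zeros."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def sparse_vector(tokens, vocabulary):
    """Store only non-zero counts, keyed by vocabulary index."""
    index = {word: i for i, word in enumerate(vocabulary)}
    counts = Counter(tokens)
    return {index[word]: count for word, count in counts.items() if word in index}


# Toy vocabulary (an assumption for illustration only).
vocabulary = ["a", "easily", "just", "kadot", "lets",
              "process", "text", "you", "vector", "sparse"]
tokens = ["kadot", "lets", "you", "process", "text", "text"]

dense = dense_vector(tokens, vocabulary)
sparse = sparse_vector(tokens, vocabulary)
print(len(dense), "dense slots vs", len(sparse), "sparse entries")
```

With a realistic vocabulary of tens of thousands of words, most slots of a document's dense vector are zero, so keeping only the non-zero entries stores far fewer values.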

⚖️ License

Kadot is released under the MIT license.

🚀 Contribute

Issues and pull requests are very welcome. Come help me!

I am not a native English speaker. If you see any language mistakes in this README or in the code (docstrings included), please open an issue.
