Results for "stopword-removal"

21 skills found

vngrs-ai / Vnlp

286

State-of-the-art, lightweight NLP tools for the Turkish language. Developed by VNGRS.

universal

deasciifier, deep-learning, dependency-parsing, +17

Updated 17h ago

liulalemx / Felig Toolkit

A toolset for Amharic language pre-processing. Includes an Amharic stemmer, transliterator, stopword remover, lexical analyzer, corpus indexer, and term weighter.

universal

amharic, amharic-corpus, amharic-nlp, +6

Updated 8mo ago

fergiemcdowall / Term Vector

A node.js module that creates a term vector from a mixed text input. Supports stopword removal and customisable separators.

universal

Updated 26d ago

ABHISHEKVALSAN / Malayalam Newspaper Article Dataset

The project scrapes articles from a Malayalam newspaper website to create a corpus. A set of queries is created and the corresponding ground-truth answers are retrieved. The result can serve as a dataset for evaluating future tools such as Malayalam stemmers, stopword removers, and lemmatizers.

universal

Updated 11mo ago

juanantoniodelgado / StopWords

PHP StopWords removal library with support for multiple languages.

universal

composer, php7, stopwords, +2

Updated 1mo ago

Abinaya-Krishnan / BM25 Model Python Implementation

BM25 information retrieval model implemented in Python.

universal

bm25, information-filtering, information-retrieval, +2

Updated 5y ago

sergio11 / Spam Email Classifier Lstm

This project uses a Bi-directional LSTM model 📧🤖 to classify emails as spam or legitimate, utilizing NLP techniques like tokenization, padding, and stopword removal. It aims to create an effective email classifier 💻📊 while addressing overfitting with strategies like early stopping 🚫.

universal

bilstm, confusion-matrix, data-preprocessing, +10

Updated 1mo ago

bryanchw / Traditional Chinese Stopwords And Punctuations Library

A Python library for removing Traditional Chinese stopwords and punctuation.

universal

cantonese, nlp, punctuation, +3

Updated 3mo ago

rhnfzl / SqueakyCleanText

Text preprocessing and PII anonymisation for NLP/ML. ONNX NER ensemble, language detection, stopword removal. Built for statistical ML and language models.

universal

anonymization, gliner, machine-learning, +15

Updated 14d ago

ahirtonlopes / Text Mining

Basic text mining and NLP operations such as tokenization, Portuguese POS tagging, and stopword removal, among others.

universal

Updated 6mo ago

afadel151 / Document Indexer

An open-source document indexing and retrieval system written from scratch in Java. It implements core information retrieval (IR) techniques including tokenization, stopword removal, stemming, TF-IDF weighting, and BM25 ranking.

universal

Updated 5mo ago

prigarg / Naive Bayes Algorithm From Scratch For Text Classification

The Naïve Bayes algorithm is implemented from scratch to classify emails as spam or not spam.

universal

email-classification, feature-selection, naivebayesclassifier, +2

Updated 1y ago

machinelearningprodigy / Sentiment Analysis

The Twitter Sentiment Analysis app predicts whether a tweet has a Positive 😊 or Negative 😞 sentiment using Logistic Regression and Naive Bayes models. It preprocesses text with stemming and stopword removal for better accuracy and provides color-coded visual feedback for easy interpretation.

universal

ai, notebook-jupyter, pip, +5

Updated 1mo ago

Salma0-8 / Sentiment Insights Analyzing ChatGPT User Review

I employed NLP techniques to evaluate user feedback on ChatGPT, utilizing Python libraries like VADER for sentiment analysis to categorize reviews into positive, neutral, and negative sentiments. Implemented data preprocessing techniques such as tokenization and stopword removal, visualizing results with Plotly to yield actionable insights.

universal

Updated 6mo ago

akshayaram95 / Near Real Time Road Traffic Event Detection Using Twitter And Spark.

Gathers tweets using the Twitter Search API, pre-processes them, and extracts important features to build a model with Spark MLlib. Streams tweets using the Twitter Streaming API and, after applying partial filters, pushes the data into a Kafka topic via a Kafka producer. A Kafka consumer reads from the topic, and the data is pre-processed with tokenization, stopword removal, and similar steps. Machine-readable features are extracted with a bag-of-words approach and instances are classified with the model. Tweets are indexed in Elasticsearch after classification, and a traffic heat map is built from the coordinate data stored there.

universal

Updated 1y ago

astuanax / Stopwords

Stopwords removal:

universal

Updated 7mo ago

atahanuz / Turkish Text Preprocessing

A web application for Turkish text preprocessing including tokenization, stemming, normalization, and stopword removal.

universal

Updated 2mo ago

sarwaralamsb / Text To Keyword Extraction

A Python web app that extracts keywords from text using TF-IDF and NLP, with adjustable keyword count and stopword removal.

universal

Updated 1y ago

Safae26 / Bag Of Words

A complete Bag of Words pipeline built with Python, NLTK, and spaCy. It demonstrates text preprocessing (tokenization, lowercasing, stopword removal, lemmatization) and converts text into numerical vectors using word frequency counts. Perfect for understanding fundamental NLP vectorization techniques.

universal

Updated 2mo ago

Manavarya09 / Finance Tracker

Built a machine learning model to classify news articles as real or fake using NLP techniques like tokenization, stopword removal, and TF-IDF. Trained Naïve Bayes and Logistic Regression models, achieving X% accuracy. Analyzed linguistic patterns to differentiate fake and real new

zed

Updated 5mo ago