Arabicstemmer
Assem's Arabic Light Stemmer is a Snowball-based stemming algorithm for Arabic, aimed mainly at improving search.
Assem's Arabic Stemmer 
This is an Arabic stemming algorithm written in the Snowball language. It offers light stemming and text normalization.
@article{Chelli2018,
author = "Assem Chelli",
title = "{Assem's Arabic Stemmer}",
year = "2018",
month = "11",
url = "https://figshare.com/articles/Assem_s_Arabic_Stemmer/7295690",
doi = "10.6084/m9.figshare.7295690.v1"
}
This is a sample of results:
Word | Light Stemmer | Root-Based Stemmer
------------ | ------------- | ------------
طفل | طفل | طفل
اطفال | اطفال | طفل
الاطفال | اطفال | طفل
اطفالكم | اطفال | طفل
فأطفالكم | اطفال | طفل
اطفالهم | اطفال | طفل
والاطفال | اطفال | طفل
فاطفالهم | اطفال | طفل
وطفل | طفل | طفل
الطفولة | طفول | طفل
والطفلتين | طفل | طفل
طفلتان | طفل | طفل
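To make the difference between the two stemmers concrete, the short Python sketch below groups the sample words above by the stem each stemmer produced. The rows are copied from the sample results; the names `SAMPLE` and `group_by_stem` are illustrative and not part of this project.

```python
from collections import defaultdict

# (word, light stem, root-based stem) rows copied from the sample results above.
SAMPLE = [
    ("طفل", "طفل", "طفل"),
    ("اطفال", "اطفال", "طفل"),
    ("الاطفال", "اطفال", "طفل"),
    ("اطفالكم", "اطفال", "طفل"),
    ("فأطفالكم", "اطفال", "طفل"),
    ("اطفالهم", "اطفال", "طفل"),
    ("والاطفال", "اطفال", "طفل"),
    ("فاطفالهم", "اطفال", "طفل"),
    ("وطفل", "طفل", "طفل"),
    ("الطفولة", "طفول", "طفل"),
    ("والطفلتين", "طفل", "طفل"),
    ("طفلتان", "طفل", "طفل"),
]


def group_by_stem(rows, column):
    """Map each stem in the given column (1 = light, 2 = root) to its words."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[0])
    return dict(groups)


light_groups = group_by_stem(SAMPLE, 1)
root_groups = group_by_stem(SAMPLE, 2)

print(len(light_groups), "light-stem groups")  # 3 groups
print(len(root_groups), "root-stem groups")    # 1 group
```

On this sample the light stemmer keeps the singular, the broken plural, and the abstract noun apart, while the root-based stemmer collapses all twelve words onto one root. Which behavior is preferable depends on how much recall versus precision a search application needs.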
Requirements:
The dependencies are already attached as git submodules, so just run:
$ git submodule update --init --recursive
Build:
$ make build
Run:
- Light Stemmer
$ make run
الطالب
طالب
- Root-Based Stemmer
$ make run_root
الطالب
طلب
Test:
The tests run against the Arabic sample from snowball-data and measure speed, grouping factor, and precision.
$ make test
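As a rough illustration of what those three measurements can look like, here is a minimal Python sketch. It is not the project's test harness. The definitions are assumptions: precision is taken as the share of words whose stem matches an expected stem, and grouping factor as distinct words divided by distinct stems. `evaluate` and `toy_stem` are hypothetical names, and `toy_stem` is only a stand-in that strips a leading definite article.

```python
import time


def evaluate(stem, words, expected):
    """Return (precision, grouping_factor, words_per_second) for a stemmer.

    precision: share of words whose stem equals the expected stem (assumed).
    grouping_factor: distinct words / distinct stems (assumed definition).
    """
    start = time.perf_counter()
    stems = [stem(word) for word in words]
    elapsed = time.perf_counter() - start

    matches = sum(1 for got, want in zip(stems, expected) if got == want)
    precision = matches / len(words)
    grouping_factor = len(set(words)) / len(set(stems))
    speed = len(words) / elapsed if elapsed > 0 else float("inf")
    return precision, grouping_factor, speed


def toy_stem(word):
    """Toy stand-in, NOT the real stemmer: drop a leading 'ال' on longer words."""
    return word[2:] if word.startswith("ال") and len(word) > 4 else word


# Word/stem pairs taken from the light stemmer examples in this README.
words = ["الطالب", "طالب", "الاطفال", "اطفال"]
expected = ["طالب", "طالب", "اطفال", "اطفال"]

precision, grouping, speed = evaluate(toy_stem, words, expected)
print(f"precision={precision:.2f} grouping={grouping:.2f}")
```

Passing the stemmer in as a function keeps the measurement code independent of how the stemmer is built, so the same sketch could wrap a compiled binary or a generated library.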
Distributions:
- Generate distributions of the light stemmer for the available languages:
$ make dist
