SkillAgentSearch skills...

Miknaaz

Generate arabic golden standard corpus for morphology and stemming

Install / Use

/learn @linuxscout/Miknaaz
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Miknaaz مكناز

Description

Generate Arabic golden standard corpus for morphology and stemming

Citation

If you would cite it in academic work, can you use this citation

Taha Zerrouki‏, Miknaaz,  http://github.com/linuxscout/miknaaz, 2023

or in bibtex format

@misc{zerrouki2018miknaaz,
  title={Miknaaz: Generate arabic golden standard},
  author={Zerrouki, Taha},
  url={http://github.com/linuxscout/miknaaz},
  year={2018}
}

Usage

  • Build word features for linguistics building corpus
from miknaaz.corpus_builder import CorpusBuilder
text = u"إلى البيت"
lemmer = CorpusBuilder()
words = lemmer.tokenize(text)
for word in words:
    result = lemmer.morph_suggestions(word, True)
    print(result)
  • Extract separate features

    from miknaaz.corpus_builder import CorpusBuilder
    text = u"إلى البيت"
    lemmer = CorpusBuilder()
    words = lemmer.tokenize(text)
    # test get lemmas
    for word in words:
        result = lemmer.get_lemmas(word)
        # the result contains objects
        print(result)
    # test get roots
    for word in words:
        result = lemmer.get_roots(word)
        # the result contains objects
        print(result)
    # test get wordtypes
    for word in words:
        result = lemmer.get_word_type(word)
        # the result contains objects
        print(result)
    # test get wazns
    for word in words:
        result = lemmer.get_wazns(word)
        # the result contains objects
        print(result)
    
View on GitHub
GitHub Stars12
CategoryDevelopment
Updated3y ago
Forks2

Languages

Python

Security Score

75/100

Audited on Jan 13, 2023

No findings