wenyan-lang / Wenyan: Classical Chinese programming language. A programming language for the ancient Chinese.
pariskang / CMLM ZhongJing: The first Traditional Chinese Medicine large language model, "CMLM-ZhongJing". Inspired by the profound wisdom of the ancient Chinese medical master Zhang Zhongjing, it is a pre-trained large language model designed specifically for the field of Traditional Chinese Medicine.
NoorBayan / Diwan: Diwan is the largest Arabic poetry dataset, containing nearly 500,000 poems and over 15 million verses. It spans various poetic forms, meters, styles, and themes, from ancient to modern times. The dataset is organized into categories to support research in Arabic literature, prosody, and natural language processing.
OliverHellwig / Sanskrit: Data for the quantitative study of (Vedic) Sanskrit
ambuda-org / Ambuda: Main application code for Ambuda, a breakthrough Sanskrit library (ambuda.org)
goru001 / Nlp For Sanskrit: State-of-the-art language models and classifier for Sanskrit (an ancient Indian language)
googleartsculture / Workbench: Fabricius is a collaborative project between Google Arts & Culture, the Australian Centre of Egyptology at Macquarie University and Ubisoft. Together we developed the first digital tool that gives experts a fast way to decode the ancient Egyptian language (Hieroglyphs) and everyone else an easy way to learn about and even write in this ancient language.
Pzoom522 / HistSumm: Code and data for "Summarising Historical Text in Modern Languages" (EACL 2021)
ttzHome / AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation (a pre-trained model for Classical Chinese)
back-kh / SADA Ancient Palm Leaf Manuscripts Recognitions: [PRL 2025, APSIPA 2022] Syllable Analysis Data Augmentation (SADA). This project introduces a glyph dictionary and a grammar-aware augmentation strategy designed to enhance Khmer palm leaf manuscript recognition. By modeling the language's grammatical structure, we support more robust OCR performance in low-resource settings.
Electronic-Old-Persian-Library / Old Persian Dataset: Raw dataset for Old Persian cuneiform
GoThereGit / EvaHan: Evaluation of Natural Language Processing (NLP) tools for the Ancient Chinese language
praeclarum / CuneiformTranslators: Neural network trained to translate from ancient languages to modern languages
proiel / Proiel Treebank: Official releases of the PROIEL treebank of ancient Indo-European languages
pranaydeeps / Ancient Greek BERT: Pre-trained BERT models for Ancient and Medieval Greek, and associated code for the LaTeCH 2021 paper "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"
CIRCSE / LT4HALA: Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA)
jmyerston / GreCy: Ancient Greek language models for spaCy
isen-zhang / ACLUE: Official GitHub repo for ACLUE, an evaluation benchmark focused on ancient Chinese language comprehension
ancientml / Ml For Ancient Languages: Machine Learning for Ancient Languages
mwenge / Lineara.xyz: A tool for exploring the Linear A corpus