⭐ NushuRescue Project

This is the repository for the COLING'25 accepted paper "NushuRescue: Reviving the Endangered Nushu Language with AI" [COLING paper].

The preservation and revitalization of endangered and extinct languages is a meaningful endeavor, conserving cultural heritage while enriching fields like linguistics and anthropology. However, these languages are typically low-resource, making their reconstruction labor-intensive and costly. This challenge is exemplified by Nushu, a rare script historically used by Yao women in China for self-expression within a patriarchal society. To address this challenge, we introduce NushuRescue, an AI-driven framework designed to train large language models (LLMs) on endangered languages with minimal data. NushuRescue automates evaluation and expands target corpora to accelerate linguistic revitalization. As a foundational component, we developed NCGold, a 500-sentence Nushu-Chinese parallel corpus, the first digitized dataset of its kind. Leveraging GPT-4-Turbo, with no prior exposure to Nushu and only 35 short examples from NCGold, NushuRescue achieved 48.69% translation accuracy on 50 withheld sentences and generated NCSilver, a set of 98 newly translated modern Chinese sentences of varying lengths. A sample of both NCGold and NCSilver is included in the Supplementary Materials. Additionally, we developed FastText-based and Seq2Seq models to further support research on Nushu. NushuRescue provides a versatile and scalable tool for the revitalization of endangered languages, minimizing the need for extensive human input.

Note: The NCGold dataset was derived from carefully selected, short segments of existing Nüshu literature, transformed and annotated for academic research under fair use. Annotations are available upon request for academic use.
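
The few-shot setup described above (a small number of parallel examples placed in the prompt, with a withheld set used for scoring) could be sketched roughly as follows. This is an illustrative sketch only, not the repository's code: the function names, the prompt wording, and the position-wise character accuracy are all assumptions, and the metric used in the paper may be defined differently. The sketch stops at building the prompt text and does not call any model.

```python
from typing import List, Tuple

# A parallel pair: (source sentence, target sentence). Hypothetical type alias.
Pair = Tuple[str, str]


def split_corpus(pairs: List[Pair], n_examples: int, n_test: int) -> Tuple[List[Pair], List[Pair]]:
    """Use the first n_examples pairs as in-context examples and the
    last n_test pairs as the withheld evaluation set."""
    if n_examples + n_test > len(pairs):
        raise ValueError("corpus too small for the requested split")
    return pairs[:n_examples], pairs[len(pairs) - n_test:]


def build_prompt(examples: List[Pair], source: str) -> str:
    """Assemble a few-shot prompt: an instruction, the example pairs,
    then the new source sentence with an empty target slot."""
    blocks = ["Translate each source sentence into the target language."]
    for src, tgt in examples:
        blocks.append(f"Source: {src}\nTarget: {tgt}")
    blocks.append(f"Source: {source}\nTarget:")
    return "\n\n".join(blocks)


def char_accuracy(prediction: str, reference: str) -> float:
    """Assumed metric: share of reference positions where the prediction
    has the same character at the same index."""
    if not reference:
        return 0.0
    matches = sum(p == r for p, r in zip(prediction, reference))
    return matches / len(reference)
```

With the figures quoted above, one would call `split_corpus(pairs, 35, 50)` on the loaded corpus, send `build_prompt(examples, src)` to the model for each withheld pair, and average `char_accuracy` over the 50 outputs.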
