Lexd
A lexicon compiler specialising in non-suffixational morphologies.
This module compiles lexicons in a format loosely based on hfst-lexc and produces transducers in ATT format which are equivalent to those produced using the overgenerate-and-constrain approach with hfst-twolc (see here and here). However, it is much faster (see below).
See Usage.md for the rule file syntax.
Installation
First, clone this repository.
To build, do
    ./autogen.sh
    make
    make install
If installing to a system-wide path, you may want to run sudo make install instead for the last step.
To compile a lexicon file into a transducer, do
    lexd lexicon_file att_file
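As a rough illustration of what the resulting file contains, here is a minimal Python sketch that reads ATT-style text. It assumes the usual tab-separated layout (transition lines with source state, target state, input symbol, output symbol, and an optional weight; final-state lines with a state and an optional weight) and that epsilon is written `@0@`. The sample transducer and the names `Arc` and `parse_att` are hypothetical and are not part of Lexd.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    source: int
    target: int
    isym: str      # input-side symbol
    osym: str      # output-side symbol
    weight: float


def parse_att(text):
    """Split ATT text into a list of arcs and a dict of final states."""
    arcs, finals = [], {}
    for line in text.splitlines():
        if line == "":
            continue
        # Split on tabs only, so a symbol that is itself a space survives.
        fields = line.split("\t")
        if len(fields) in (4, 5):
            weight = float(fields[4]) if len(fields) == 5 else 0.0
            arcs.append(Arc(int(fields[0]), int(fields[1]),
                            fields[2], fields[3], weight))
        elif len(fields) in (1, 2):
            finals[int(fields[0])] = float(fields[1]) if len(fields) == 2 else 0.0
        else:
            raise ValueError(f"unexpected ATT line: {line!r}")
    return arcs, finals


# Hypothetical sample: the path c a t <n> on the input side and
# c a t on the output side (the tag maps to epsilon, assumed "@0@").
SAMPLE = "0\t1\tc\tc\n1\t2\ta\ta\n2\t3\tt\tt\n3\t4\t<n>\t@0@\n4\n"

arcs, finals = parse_att(SAMPLE)
print(len(arcs), "arcs; final states:", sorted(finals))
```

A reader like this is only meant for inspecting small outputs by eye; real pipelines would normally hand the ATT file to a transducer toolkit instead.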
To get a speed comparison, do
    make timing-test
To run basic feature smoke-tests (fast), do
    make check
Why is it faster?
When dealing with prefixes, the overgenerate-and-constrain approach initially builds a transducer like this:

It then composes that with a twolc rule to turn it into something like this:

But compiling the rule needed to do that can take hundreds of times longer than compiling the lexicon.
Lexd, meanwhile, makes multiple copies of the lexical portion and attaches one to each prefix, thus generating the second transducer directly in a similar amount of time to what is required to generate the first one.
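The difference between the two strategies can be sketched at the level of plain string pairs. This is only a toy model, not Lexd's implementation: it works on Python sets rather than transducers, and the prefixes, stems, tags, and function names are all invented for illustration. It assumes the restriction to enforce is that a prefix on the surface side must match a tag that appears after the stem on the analysis side.

```python
from itertools import product

# Hypothetical toy data: each analysis tag is realised by one prefix.
PREFIXES = {"<a>": "na", "<b>": "ko"}
STEMS = ["lob", "sal"]


def overgenerate_and_constrain(prefixes, stems):
    """Build every prefix/stem/tag combination, then filter out mismatches."""
    # Step 1: overgenerate. Every prefix is paired with every tag.
    candidates = list(product(prefixes.values(), stems, prefixes.keys()))
    # Step 2: constrain. This filter plays the role of the twolc rule.
    kept = {(stem + tag, prefix + stem)
            for prefix, stem, tag in candidates
            if prefixes[tag] == prefix}
    return kept, len(candidates)


def copy_per_prefix(prefixes, stems):
    """Attach a fresh copy of the stem list to each prefix/tag pair."""
    pairs = set()
    for tag, prefix in prefixes.items():
        for stem in list(stems):  # one copy of the lexical portion per prefix
            pairs.add((stem + tag, prefix + stem))
    return pairs


constrained, n_candidates = overgenerate_and_constrain(PREFIXES, STEMS)
direct = copy_per_prefix(PREFIXES, STEMS)

print(n_candidates, "candidates filtered down to", len(constrained))
print("same result:", constrained == direct)
```

In this toy model the first approach builds more candidates than it keeps, while the second never creates a mismatched pair in the first place. In the real tools the expensive step is compiling the constraining rule, which the set-based sketch does not model.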
| Language | Wamesa | Hebrew | Navajo | Lingala |
|---|---:|---:|---:|---:|
| Stems | 262 | 127 | 19 | 1470 |
| Total forms | 12576 | 2540 | 473 | 1649496 |
| Path restrictions | 14 | 10 | 17 | 19 |
| Lexc + Twolc | | | | |
| Lexc compilation | 25ms | 15ms | 25ms | 230ms |
| Twolc compilation | 10245ms | 1360ms | 8460ms | 275525ms |
| Rule composition | 2050ms | 225ms | 1705ms | 45550ms |
| Minimization | 65ms | 5ms | 20ms | 155ms |
| Total time (Lexc + Twolc) | 12385ms | 1605ms | 10210ms | 321460ms |
| Lexd | | | | |
| Lexd compilation | 210ms | 85ms | 10ms | 490ms |
| Format conversion | 30ms | 5ms | 5ms | 55ms |
| Total time (Lexd) | 240ms | 90ms | 15ms | 545ms |
| Speedup | 52x | 18x | 681x | 590x |
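The totals and speedup figures in the table follow directly from the per-stage timings. The short Python check below recomputes them; the numbers are copied from the table, while the dictionary layout and the `speedup` helper are just one hypothetical way to organise them.

```python
# Per-stage timings in milliseconds, copied from the table above.
TIMINGS_MS = {
    "Wamesa":  {"lexc_twolc": [25, 10245, 2050, 65],    "lexd": [210, 30]},
    "Hebrew":  {"lexc_twolc": [15, 1360, 225, 5],       "lexd": [85, 5]},
    "Navajo":  {"lexc_twolc": [25, 8460, 1705, 20],     "lexd": [10, 5]},
    "Lingala": {"lexc_twolc": [230, 275525, 45550, 155], "lexd": [490, 55]},
}


def speedup(language):
    """Ratio of total Lexc + Twolc time to total Lexd time."""
    stages = TIMINGS_MS[language]
    return sum(stages["lexc_twolc"]) / sum(stages["lexd"])


for language, stages in TIMINGS_MS.items():
    old_total = sum(stages["lexc_twolc"])
    new_total = sum(stages["lexd"])
    print(f"{language}: {old_total}ms vs {new_total}ms, "
          f"about {speedup(language):.0f}x")
```

In every column, Twolc compilation accounts for most of the Lexc + Twolc total, which is the step Lexd avoids.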
