Sagartst
Dataset of Sino-Tibetan Languages (Cognate-Coded)
CLDF dataset derived from Sagart et al.'s "Sino-Tibetan Database of Lexical Cognates" from 2019
How to cite
If you use these data, please cite
- the original source
Laurent Sagart, Guillaume Jacques, Yunfan Lai, and Johann-Mattis List (2019): Sino-Tibetan Database of Lexical Cognates. Jena: Max Planck Institute for the Science of Human History.
- the derived dataset, using the DOI of the specific released version you used
Description
This dataset is licensed under a CC-BY-4.0 license
Available online at http://dighl.github.io/sinotibetan/
Conceptlists in Concepticon:
Notes
Statistics
- Varieties: 50 (linked to 48 different Glottocodes)
- Concepts: 250 (linked to 250 different Concepticon concept sets)
- Lexemes: 12,179
- Sources: 25
- Synonymy: 1.06
- Cognacy: 12,179 cognates in 5,120 cognate sets (3,468 singletons)
- Cognate Diversity: 0.41
- Invalid lexemes: 0
- Tokens: 60,455
- Segments: 459 (0 BIPA errors, 0 CLTS sound class errors, 454 CLTS modified)
- Inventory size (avg): 51.26
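
As a quick sanity check on the figures above, the following sketch recomputes the cognate diversity from the reported counts. It assumes the definition (cognate sets - concepts) / (cognates - concepts), which this README does not state; the function name `cognate_diversity` is illustrative only.

```python
def cognate_diversity(cognate_sets: int, cognates: int, concepts: int) -> float:
    """Assumed definition (not stated in this README):
    (cognate sets - concepts) / (cognates - concepts)."""
    return (cognate_sets - concepts) / (cognates - concepts)


# Counts copied from the Statistics list.
diversity = cognate_diversity(cognate_sets=5120, cognates=12179, concepts=250)

# Share of cognate sets that contain exactly one lexeme.
singleton_share = 3468 / 5120

print(round(diversity, 2))        # 0.41
print(round(singleton_share, 2))  # 0.68
```

Under this assumed definition the result matches the reported value of 0.41, and roughly two thirds of all cognate sets are singletons.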
Contributors
Name | GitHub user | Description | Role
--- | --- | --- | ---
Laurent Sagart | | cognate coding | Author
Guillaume Jacques | | cognate coding | Author
Yunfan Lai | | data management | Author
Johann-Mattis List | @LinguList | maintainer | Author, Editor
CLDF Datasets
The following CLDF datasets are available in cldf:
- CLDF Wordlist at cldf/cldf-metadata.json
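
To see which tables the wordlist contains, the metadata file can be inspected directly. The sketch below uses only the Python standard library and assumes the usual CLDF layout, in which each entry under `tables` has a `url` and, for standard components, a `dc:conformsTo` URI ending in the component name. The helper `list_cldf_tables` is a hypothetical name, not part of any library; for real work a dedicated package such as `pycldf` is likely the better choice.

```python
import json
from pathlib import Path


def list_cldf_tables(metadata_path):
    """Return (component, url) pairs for each table described in the metadata.

    The component is None for tables without a "dc:conformsTo" entry.
    """
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    tables = []
    for table in metadata.get("tables", []):
        # The component URI is assumed to end in "#<ComponentName>",
        # e.g. "...terms.rdf#FormTable".
        uri = table.get("dc:conformsTo") or ""
        component = uri.rsplit("#", 1)[-1] or None
        tables.append((component, table.get("url")))
    return tables


if __name__ == "__main__":
    # Path as given in this README, relative to the repository root.
    path = Path("cldf/cldf-metadata.json")
    if path.exists():
        for component, url in list_cldf_tables(path):
            print(component, url)
```

Run from the repository root, this prints one line per table, which is a convenient starting point before reading the corresponding CSV files.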
