CLDF dataset derived from Sagart et al.'s "Sino-Tibetan Database of Lexical Cognates" from 2019

How to cite

If you use these data, please cite:

the original source

Laurent Sagart, Guillaume Jacques, Yunfan Lai, and Johann-Mattis List (2019): Sino-Tibetan Database of Lexical Cognates. Jena: Max Planck Institute for the Science of Human History.
the derived dataset, using the DOI of the specific released version you used

Description

This dataset is licensed under a CC-BY-4.0 license

Available online at http://dighl.github.io/sinotibetan/

Conceptlists in Concepticon:

Sagart-2019-250

Notes

Statistics

Varieties: 50 (linked to 48 different Glottocodes)
Concepts: 250 (linked to 250 different Concepticon concept sets)
Lexemes: 12,179
Sources: 25
Synonymy: 1.06
Cognacy: 12,179 cognates in 5,120 cognate sets (3,468 singletons)
Cognate Diversity: 0.41
Invalid lexemes: 0
Tokens: 60,455
Segments: 459 (0 BIPA errors, 0 CLTS sound class errors, 454 CLTS modified)
Inventory size (avg): 51.26
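
Some of the figures above can be cross-checked against each other. The sketch below is not taken from the dataset's tooling: it assumes that cognate diversity is defined as (cognate sets - concepts) / (cognates - concepts), and that "Tokens" counts segment tokens across all lexemes. Under the first assumption the reported value of 0.41 is reproduced from the counts listed above. The function name `cognate_diversity` is hypothetical.

```python
# Figures copied from the Statistics list above.
concepts = 250
lexemes = 12_179
cognates = 12_179
cognate_sets = 5_120
singletons = 3_468
tokens = 60_455


def cognate_diversity(n_sets: int, n_cognates: int, n_concepts: int) -> float:
    """Assumed definition: (sets - concepts) / (cognates - concepts).

    The value is 0 when every concept has a single cognate set and
    1 when every cognate forms its own set.
    """
    return (n_sets - n_concepts) / (n_cognates - n_concepts)


diversity = cognate_diversity(cognate_sets, cognates, concepts)

# Share of cognate sets that contain only one lexeme.
singleton_share = singletons / cognate_sets

# Average number of segments per lexeme, assuming "Tokens" counts segments.
segments_per_lexeme = tokens / lexemes

print(f"cognate diversity:   {diversity:.2f}")
print(f"singleton share:     {singleton_share:.2f}")
print(f"segments per lexeme: {segments_per_lexeme:.2f}")
```

Synonymy is not recomputed here because it depends on per-variety, per-concept counts that the summary does not list.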

Contributors

Name | GitHub user | Description | Role
--- | --- | --- | ---
Laurent Sagart | | cognate coding | Author
Guillaume Jacques | | cognate coding | Author
Yunfan Lai | | data management | Author
Johann-Mattis List | @LinguList | maintainer | Author, Editor

CLDF Datasets

The following CLDF datasets are available in cldf:

CLDF Wordlist at cldf/cldf-metadata.json
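
As a minimal sketch of how the wordlist might be read, the example below uses only the Python standard library. It assumes that the metadata file follows the usual CLDF layout, in which each entry under `tables` carries a `url` and a `dc:conformsTo` value ending in `#FormTable` for the table of lexemes, and that the forms table uses the default CLDF column name `Language_ID`. The helper names `read_forms` and `count_by` are hypothetical; a dedicated CLDF library such as pycldf would be the more robust choice for real work.

```python
import csv
import json
from pathlib import Path


def read_forms(metadata_path):
    """Return the rows of the FormTable declared in a CLDF metadata file."""
    metadata_path = Path(metadata_path)
    metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    for table in metadata.get("tables", []):
        # Assumption: the FormTable is marked by its dc:conformsTo term.
        if str(table.get("dc:conformsTo", "")).endswith("#FormTable"):
            csv_path = metadata_path.parent / table["url"]
            with csv_path.open(newline="", encoding="utf-8") as handle:
                return list(csv.DictReader(handle))
    raise ValueError("no FormTable declared in the metadata")


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


if __name__ == "__main__":
    path = Path("cldf/cldf-metadata.json")
    if path.exists():
        forms = read_forms(path)
        print(len(forms), "lexemes")
        # Assumption: the default CLDF column name Language_ID is used.
        print(len(count_by(forms, "Language_ID")), "varieties")
```

If the assumptions hold, the two printed counts should correspond to the lexeme and variety figures reported in the statistics.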
