
trigrams

[![Build][badge-build-image]][badge-build-url] [![Coverage][badge-coverage-image]][badge-coverage-url] [![Downloads][badge-downloads-image]][badge-downloads-url]

Trigrams for 500+ languages.

What is this?

This package exposes trigrams for natural languages. They are based on the most translated copyright-free document on this planet: the Universal Declaration of Human Rights (UDHR).

When should I use this?

When you are dealing with natural language detection.

Install

This package is [ESM only][github-gist-esm]. In Node.js (version 18+), install with [npm][npmjs-install]:

npm install trigrams

In Deno with [esm.sh][esmsh]:

import {min, top} from 'https://esm.sh/trigrams@6'

In browsers with [esm.sh][esmsh]:

<script type="module">
  import {min, top} from 'https://esm.sh/trigrams@6?bundle'
</script>

Use

import {min, top} from 'trigrams'

console.log((await min()).nld)
console.log((await top()).pam)

Yields:

[ // 300 top trigrams.
  ' ar',
  'eer',
  'tij',
  // …
  'de ',
  'an ',
  'en ' // Most common trigram.
]
{ // 300 top trigrams.
  'isa': 6,
  'upa': 6,
  'i k': 6,
  // …
  'ang': 273,
  'ing': 282,
  'ng ': 572 // Most common trigram with how often it was found.
}

API

This package exports the identifiers [min][api-min] and [top][api-top]. It exports no [TypeScript][] types. There is no default export.

min()

Get top trigrams.

Returns

Returns a promise resolving to an object mapping [UDHR in Unicode][efele-udhr] codes to arrays containing the top 300 trigrams, sorted from least occurring to most occurring (Promise<Record<string, Array<string>>>).
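As a rough illustration of how such arrays can feed language detection, the sketch below scores a text by how many of its trigrams appear in each profile. The names `profiles`, `trigramsOf`, and `guess` are hypothetical and not part of this package, and the tiny profiles only reuse the sample trigrams shown under Use. A real detector would load the full result of `min()` and likely use a more refined scoring.

```javascript
// Hypothetical miniature profiles in the same shape as the result of `min()`
// (code -> trigrams, least to most common). Real profiles hold 300 entries.
const profiles = {
  nld: [' ar', 'eer', 'tij', 'de ', 'an ', 'en '],
  pam: ['isa', 'upa', 'i k', 'ang', 'ing', 'ng ']
}

// Split a text into overlapping three-character windows.
// Assumption: lowercase and pad with two spaces, as described under Data.
function trigramsOf(text) {
  const padded = '  ' + text.toLowerCase() + '  '
  const result = []
  for (let index = 0; index + 3 <= padded.length; index++) {
    result.push(padded.slice(index, index + 3))
  }
  return result
}

// Return the code whose profile shares the most trigrams with `text`.
// This is a plain overlap count, not a tuned detection algorithm.
function guess(text, profiles) {
  const found = trigramsOf(text)
  let best = null
  let bestScore = -1
  for (const [code, list] of Object.entries(profiles)) {
    const known = new Set(list)
    const score = found.filter((trigram) => known.has(trigram)).length
    if (score > bestScore) {
      best = code
      bestScore = score
    }
  }
  return best
}

console.log(guess('de tijd en de man', profiles))
```

With the full data, `profiles` could be replaced by `await min()`.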

top()

Get top trigrams mapped to occurrence counts.

Returns

Returns a promise resolving to an object mapping [UDHR in Unicode][efele-udhr] codes to objects mapping the top 300 trigrams to occurrence counts (Promise<Record<string, Record<string, number>>>).
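To make the two shapes concrete, here is a small sketch that turns one language's counts (the inner object returned by `top()`) into an array ordered from least to most common, which is the ordering `min()` is documented to use. The helper name `toSortedTrigrams` is hypothetical, and the sample counts are taken from the output shown under Use rather than loaded from the package.

```javascript
// Sample counts in the shape of one entry of the result of `top()`.
const sample = {'ng ': 572, isa: 6, ing: 282, ang: 273}

// Hypothetical helper: order trigrams from least to most occurring.
function toSortedTrigrams(counts) {
  return Object.entries(counts)
    .sort((left, right) => left[1] - right[1])
    .map(([trigram]) => trigram)
}

console.log(toSortedTrigrams(sample))
```

The most common trigram ends up last, so it can be read with `.at(-1)` on the resulting array.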

Data

The trigrams are based on the [Unicode][efele-udhr] versions of the [Universal Declaration of Human Rights][ohchr-udhr].

The files are created from all paragraphs made available by [wooorm/udhr][github-wooorm-udhr] and do not include headings and such.

Before creating trigrams,

  • the Unicode characters from \u0021 to \u0040 (inclusive) are removed
  • one or more white space characters (\s+) are replaced with a single space
  • uppercase alphabetic characters ([A-Z]) are lowercased

Additionally, the input is padded with two spaces on both sides.
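The steps above can be sketched roughly as follows. This is not the script used to build the data files; `clean` and `trigramCounts` are hypothetical names, and the code follows the list literally (characters are removed rather than replaced). One assumption: `toLowerCase()` is used, which also lowercases letters outside `[A-Z]`.

```javascript
// Hypothetical helper: apply the three cleaning steps in order.
function clean(value) {
  return value
    .replace(/[\u0021-\u0040]/g, '') // Remove punctuation, digits, and symbols.
    .replace(/\s+/g, ' ') // Collapse runs of white space to one space.
    .toLowerCase() // Assumption: lowercases more than just [A-Z].
}

// Hypothetical helper: count every overlapping three-character window.
function trigramCounts(value) {
  const padded = '  ' + clean(value) + '  ' // Two spaces on both sides.
  const counts = {}
  for (let index = 0; index + 3 <= padded.length; index++) {
    const trigram = padded.slice(index, index + 3)
    counts[trigram] = (counts[trigram] || 0) + 1
  }
  return counts
}

console.log(clean('Hello, World!'))
console.log(trigramCounts('aaaa'))
```

Sorting such counts and keeping the 300 largest would give objects shaped like those described under `top()`.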

<!--support start-->

| Code | Name |
| - | - |
| 007 | Sãotomense |
| 008 | Crioulo, Upper Guinea (008) |
| 009 | Mbundu (009) |
| 010 | Tetun Dili |
| 011 | Umbundu (011) |
| 013 | (Mijisa) |
| 014 | (Maiunan) |
| 016 | (Minjiang, spoken) |
| 017 | (Minjiang, written) |
| 020 | Drung |
| 021 | (Muzzi) |
| 022 | (Klau) |
| 025 | (Bizisa) |
| 026 | (Yeonbyeon) |
| 027 | Gumuz |
| 028 | Kafa |
| 029 | Sidamo |
| 030 | Kituba (2) |
| 032 | South Azerbaijani |
| 041 | Latvian (2) |
| 042 | Spanish (resolution) |
| 043 | Zarma |
| 044 | Mirandese |
| 045 | Maasai |
| 046 | Malay, Papuan |
| 047 | Malay, Ambonese |
| 048 | Minangkabau (2) |
| 049 | Banjar |
| 050 | (Bataknese) |
| 052 | Morisyen |
| 053 | Hausa (2) |
| 054 | Catalan (2) |
| 055 | Jamaican Creole English |
| 056 | Saint Lucian Creole French |
| 057 | Maay |
| 058 | Somali (Af Marka) |
| 059 | North Saami (2) |
| 060 | Inari Saami |
| 061 | Skolt Saami |
| 062 | Swahili (Chimwiini) |
| 063 | Swahili (Kibajuni) |
| 064 | Dabarre |
| 065 | Garre |
| 066 | Jiiddu |
| 067 | Finnish (2) |
| 068 | French (Welche) |
| 069 | Maori (2) |
| 071 | Kabyle |
| aar | Afar |
| abk | Abkhaz |
| ace | Aceh |
| acu | Achuar-Shiwiar |
| acu_1 | Achuar-Shiwiar (1) |
| ada | Dangme |
| ady | Adyghe |
| afr | Afrikaans |
| agr | Aguaruna |
| aii | Assyrian Neo-Aramaic |
| ajg | Aja |
| aka_akuapem | Twi (Akuapem) |
| aka_asante | Twi (Asante) |
| aka_fante | Fante |
| als | Albanian, Tosk |
| alt | Altai, Southern |
| amc | Amahuaca |
| ame | Yaneshaʼ |
| amh | Amharic |
| ami | Amis |
| amr | Amarakaeri |
| arb | Arabic, Standard |
| arl | Arabela |
| arn | Mapudungun |
| ast | Asturian |
| auc | Waorani |
| auv | Occitan (Auvergnat) |
| ayo | Ayoreo |
| ayr | Aymara, Central |
| azj_cyrl | Azerbaijani, North (Cyrillic) |
| azj_latn | Azerbaijani, North (Latin) |
| bam | [Bamanankan](http
