56 skills found · Page 1 of 2
bitextor / BitextorBitextor generates translation memories from multilingual websites
averkij / A StudioLingtrain Alignment Studio is an ML based app for texts alignment on different languages. It can produce parallel corpora and parallel books.
ajinkyakulkarni14 / TED Multilingual Parallel CorpusTED parallel Corpora is growing collection of Bilingual parallel corpora, Multilingual parallel corpora and Monolingual corpora extracted from TED talks www.ted.com for 109 world languages.
csebuetnlp / BanglanmtThis repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
JeremyCCHsu / Vae NpvcRe-implementation the code used in Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder
jungyeul / Korean Parallel CorporaKorean Parallel Corpus
clab / Wikipedia Parallel TitlesTools for extracting parallel corpora from article titles across languages in Wikipedia
tsuruoka-lab / BSDThe Business Scene Dialogue corpus
joshua-decoder / Indian Parallel CorporaNo description available
mahdeslami11 / Voice Conversion From Non Parallel Corpora Using Variational Auto Encoder.No description available
biomedical-translation-corpora / CorporaParallel corpora for the biomedical domain
rpryzant / Proxy A DistanceProxy A-Distance algorithm for measuring domain disparity in parallel corpora
M4t1ss / Parallel Corpora ToolsTools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
averkij / Lingtrain Aligner EditorExtracts parallel corpora from the 2 raw texts in different languages.
Kartikaggarwal98 / Indian ParallelCorpusCurated list of publicly available parallel corpus for Indian Languages
ehsanasgari / 1000LangsCreating super-parallel corpora of more than 1500+ unique languages for NLP research
FrancisGregoire / ParSentExtractA BiRNN framework implemented in Python and TensorFlow to extract parallel sentences from aligned comparable corpora.
twadada / Multilingual NlmCode for "Unsupervised Multilingual Word Embedding with Limited Resources using Neural Language Models" and "Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora"
darius / Spaced OutVocab drill using parallel corpora, plus classic spaced-repetition drill
timarkh / TsakorpusYet another search platform for linguistic corpora.