Joey NMT
Goal and Purpose
:koala: The Joey NMT framework is developed for educational purposes. It aims to be a clean and minimalistic code base that helps novices quickly find answers to the following questions.
- :grey_question: How to implement classic NMT architectures (RNN and Transformer) in PyTorch?
- :grey_question: What are the building blocks of these architectures and how do they interact?
- :grey_question: How to modify these blocks (e.g. deeper, wider, ...)?
- :grey_question: How to modify the training procedure (e.g. add a regularizer)?
In contrast to other NMT frameworks, we do not aim for the most recent features or for speed through engineering or training tricks, since this often goes hand in hand with an increase in code complexity and a decrease in readability. :eyes:
However, Joey NMT re-implements baselines from major publications.
Check out the detailed documentation :books: and our paper. :newspaper:
Contributors
Joey NMT was initially developed and is maintained by Jasmijn Bastings (University of Amsterdam) and Julia Kreutzer (Heidelberg University), now both at Google Research. Mayumi Ohta at Fraunhofer Institute is continuing the legacy.
Welcome to our new contributors :hearts:, please don't hesitate to open a PR or an issue if there's something that needs improvement!
Features
Joey NMT implements the following features (aka the minimalist toolkit of NMT :wrench:):
- Recurrent Encoder-Decoder with GRUs or LSTMs
- Transformer Encoder-Decoder
- Attention Types: MLP, Dot, Multi-Head, Bilinear
- Word-, BPE- and character-based tokenization
- BLEU, ChrF evaluation
- Beam search with length penalty and greedy decoding
- Customizable initialization
- Attention visualization
- Learning curve plotting
- Scoring hypotheses and references
- Multilingual translation with language tags
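As an illustration of the beam-search length penalty listed above, a commonly used variant (GNMT-style) divides a hypothesis's cumulative log-probability by a length term so that longer hypotheses are not unfairly punished. The sketch below shows the idea only; the function names and the default alpha are illustrative, not Joey NMT's actual API:

```python
def length_penalty(length: int, alpha: float = 1.0) -> float:
    """GNMT-style length penalty: ((5 + length) / 6) ** alpha."""
    return ((5 + length) / 6) ** alpha

def rescore(log_prob: float, length: int, alpha: float = 1.0) -> float:
    """Divide the cumulative log-probability by the length penalty."""
    return log_prob / length_penalty(length, alpha)

# A longer hypothesis can overtake a shorter one after rescoring:
short = rescore(-4.0, length=4)    # -4.0 / 1.5  = -2.667
long_ = rescore(-6.0, length=10)   # -6.0 / 2.5  = -2.4  (wins)
```

Larger alpha values favor longer outputs; alpha = 0 disables the penalty entirely.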
Installation
Joey NMT is built on PyTorch. Please make sure you have a compatible environment. We tested Joey NMT v2.3 with
- python 3.11
- torch 2.1.2
- cuda 12.1
:warning: Warning When running on GPU you need to manually install the PyTorch version suitable for your CUDA version. For example, you can install PyTorch 2.1.2 with CUDA v12.1 as follows:

```shell
python -m pip install --upgrade torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121
```
You can install Joey NMT either A. via pip or B. from source.
A. Via pip (the latest stable version)
```shell
python -m pip install joeynmt
```
B. From source (for local development)
```shell
git clone https://github.com/joeynmt/joeynmt.git  # Clone this repository
cd joeynmt
python -m pip install -e .  # Install Joey NMT and its requirements
python -m unittest          # Run the unit tests
```
:memo: Info For Windows users, we recommend checking whether the text files (i.e. test/data/toy/*) have UTF-8 encoding.
Changelog
v2.3
- introduced DistributedDataParallel.
- implemented language tags, see notebooks/torchhub.ipynb
- released an iwslt14 de-en-fr multilingual model (trained using DDP)
- special symbols definition refactoring
- configuration refactoring
- autocast refactoring
- bugfixes
- upgrade to python 3.11, torch 2.1.2
- documentation refactoring
v2.2.1
v2.2
- compatibility with torch 1.13 tested
- torchhub introduced
- bugfixes, minor refactoring
v2.1
- upgrade to python 3.10, torch 1.12
- replaced automatic mixed precision: moved from NVIDIA's amp to PyTorch's amp package
- replaced discord.py with pycord in the Discord bot demo
- data iterator refactoring
- add wmt14 ende / deen benchmark trained on v2 from scratch
- add tokenizer tutorial
- minor bugfixes
v2.0 Breaking change!
- upgrade to python 3.9, torch 1.11
- torchtext.legacy dependencies are completely replaced by torch.utils.data
- joeynmt/tokenizers.py: handles tokenization internally (also supports BPE-dropout!)
- joeynmt/datasets.py: loads data from plaintext, tsv, and Hugging Face's datasets
- scripts/build_vocab.py: trains subwords, creates joint vocab
- enhancements in decoding
- scoring with hypotheses or references
- repetition penalty, ngram blocker
- attention plots for transformers
- yapf, isort, flake8 introduced
- bugfixes, minor refactoring
:warning: Warning Models trained with Joey NMT v1.x can still be decoded with Joey NMT v2.0, but there is no guarantee that you will reproduce the same scores as before.
v1.4
- upgrade to sacrebleu 2.0, python 3.7, torch 1.8
- bugfixes
v1.3
- upgrade to torchtext 0.9 (torchtext -> torchtext.legacy)
- n-best decoding
- demo colab notebook
v1.0
- Multi-GPU support
- fp16 (half precision) support
Documentation & Tutorials
We also updated the documentation thoroughly for Joey NMT 2.0!
For details, follow the tutorials in notebooks dir.
v2.x
- quick start with joeynmt2: a quick-start guide that walks you step by step through installation, data preparation, training, and evaluation
- torch hub interface: how to generate translations from a pretrained model
v1.x
- demo notebook
- starter notebook: Masakhane - Machine Translation for African Languages, in masakhane-io
- joeynmt toy models: a collection of Joey NMT scripts by @bricksdont
Usage
:warning: Warning For Joey NMT v1.x, please refer to the archive here.
Joey NMT has three modes: train, test, and translate, and all of them take a
YAML-style config file as argument.
You can find examples in the configs directory.
transformer_small.yaml contains a detailed explanation of configuration options.
Most importantly, the configuration contains the description of the model architecture (e.g. number of hidden units in the encoder RNN), paths to the training, development and test data, and the training hyperparameters (learning rate, validation frequency etc.).
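To make the structure concrete, here is an abbreviated config sketch. The keys shown are a simplified, illustrative subset with made-up values; consult configs/transformer_small.yaml for the authoritative set of options:

```yaml
name: "my_experiment"            # illustrative values throughout
data:
    train: "test/data/toy/train" # path prefix to training data
    dev: "test/data/toy/dev"
    test: "test/data/toy/test"
model:
    encoder:
        type: "transformer"
        hidden_size: 64
    decoder:
        type: "transformer"
        hidden_size: 64
training:
    learning_rate: 0.001
    batch_size: 32
    validation_freq: 1000
    model_dir: "models/my_experiment"
    overwrite: False
```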
:memo: Info Note that subword model training and joint vocabulary creation are not included in the three modes above; they have to be done separately. We provide a script that takes care of it: scripts/build_vocab.py.

```shell
python scripts/build_vocab.py configs/transformer_small.yaml --joint
```
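Conceptually, a joint vocabulary merges source- and target-side token counts and keeps the most frequent entries. A pure-Python sketch of that idea (the real script additionally trains subword models; the function name here is illustrative):

```python
from collections import Counter

def build_joint_vocab(src_lines, trg_lines, max_size=32000):
    """Count whitespace tokens on both sides and keep the most frequent."""
    counts = Counter()
    for line in list(src_lines) + list(trg_lines):
        counts.update(line.split())
    return [tok for tok, _ in counts.most_common(max_size)]

vocab = build_joint_vocab(["the cat sat", "the dog"],
                          ["die katze", "der hund"],
                          max_size=5)
# "the" occurs twice, so it ranks first
```

Sharing one vocabulary across source and target is what allows tied embeddings and multilingual setups with language tags.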
train mode
For training, run
```shell
python -m joeynmt train configs/transformer_small.yaml
```
This will train a model on the training data, validate on the validation data, and store
model parameters, vocabularies, and validation outputs. All needed information should be
specified in the data, training and model sections of the config file (here
configs/transformer_small.yaml).
```
model_dir/
├── *.ckpt          # checkpoints
├── *.hyps          # translated texts at validation
├── config.yaml     # config file
├── spm.model       # sentencepiece model / subword-nmt codes file
├── src_vocab.txt   # src vocab
├── trg_vocab.txt   # trg vocab
├── train.log       # train log
└── validation.txt  # validation scores
```
:bulb: Tip To avoid accidentally overwriting model_dir, set overwrite: False in the config file.
test mode
This mode will generate translations for the validation and test sets (as specified in the
configuration) in model_dir/out.[dev|test].
```shell
python -m joeynmt test configs/transformer_small.yaml
```
You can specify the checkpoint path explicitly in the config file. If load_model is not given
in the config, the best model in model_dir will be used to generate translations.
You can also specify, e.g., sacrebleu options in the test section of the config file.
:bulb: Tip scripts/average_checkpoints.py will generate averaged checkpoints for you.

```shell
python scripts/average_checkpoints.py --inputs model_dir/*00.ckpt --output model_dir/avg.ckpt
```
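The averaging itself is just an element-wise mean of the same parameter taken across checkpoints. The pure-Python sketch below shows that principle on plain lists; the actual script operates on PyTorch state dicts loaded with torch.load, and the function name here is illustrative:

```python
def average_checkpoints(state_dicts):
    """Element-wise mean of matching parameters across checkpoints.
    Parameters are represented here as plain lists of floats."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n
              for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

avg = average_checkpoints([
    {"w": [1.0, 2.0]},
    {"w": [3.0, 4.0]},
])
# avg["w"] == [2.0, 3.0]
```

Averaging the last few checkpoints often smooths out noise from the final training steps and can give a small quality boost at test time.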
If you want to output the log-probabilities of the hypotheses or references, you can
specify return_score: 'hyp' or return_score: 'ref' in the testing section of the
config, and run test with the --output-path and --save-scores options.

```shell
python -m joeynmt test configs/transformer_small.yaml --output-path model_dir/pred --save-scores
```
This will generate model_dir/pred.{dev|test}.{scores|tokens}, which contain the scores and the corresponding tokens.
:memo: Info
- If you set return_score: 'hyp' with greedy decoding, then token-wise scores will be returned instead of sequence-level scores.
