EntropyRegularization
Code for Generalized Entropy Regularization paper
# Generalized Entropy Regularization
This library is built on top of fairseq (PyTorch). Generalized entropy regularization can be used with any probabilistic model and data set: set the `--criterion` flag to `jensen_cross_entropy` and specify `--alpha` and `--beta` when running `fairseq-train`. Pass `--use-uniform` to use the uniform distribution as the baseline; otherwise, the unigram distribution with annealing parameter `--T` will be used (run `fairseq-train --help` to see all options).
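To give a feel for the kind of penalty this criterion adds, here is a minimal NumPy sketch of a cross-entropy loss regularized by a skew Jensen–Shannon divergence against a uniform baseline. This is an illustration, not the fairseq implementation: the exact divergence used by `jensen_cross_entropy` may differ, and the function names `skew_js` and `regularized_loss` are made up for this example.

```python
import numpy as np

def skew_js(p, q, alpha):
    """alpha-skew Jensen-Shannon divergence (one common parameterization):

        JS_alpha(p || q) = alpha * KL(p || m) + (1 - alpha) * KL(q || m),
        m = alpha * p + (1 - alpha) * q.

    Assumes p and q are strictly positive probability vectors.
    """
    m = alpha * p + (1 - alpha) * q
    kl = lambda a, b: float(np.sum(a * (np.log(a) - np.log(b))))
    return alpha * kl(p, m) + (1 - alpha) * kl(q, m)

def regularized_loss(log_probs, target, alpha, beta):
    """Token-level cross-entropy plus a beta-weighted Jensen term.

    Uses the uniform distribution as the baseline, analogous in spirit
    to passing --use-uniform. `log_probs` is a 1-D array of model
    log-probabilities over the vocabulary; `target` is the gold index.
    """
    p = np.exp(log_probs)
    u = np.full_like(p, 1.0 / p.size)  # uniform baseline distribution
    ce = -log_probs[target]            # standard cross-entropy term
    return ce + beta * skew_js(p, u, alpha)
```

With `beta = 0` this reduces to plain cross-entropy; larger `beta` pushes the model distribution toward the baseline, which is the high-entropy bias the paper studies.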
## Examples
### Neural Machine Translation
Preprocess data following examples in the Translation README. A convolutional model can then be trained on IWSLT'14 (De-En) with the following command:
```
fairseq-train data-bin/iwslt14.tokenized.de-en \
    --arch fconv_iwslt_de_en --max-tokens 4000 --update-freq 8 \
    --clip-norm 0.1 --dropout 0.2 --criterion jensen_cross_entropy \
    --lr-scheduler fixed --min-lr 1e-8 --lr 0.5 --alpha 0.5 \
    --beta 0.7 --use-uniform
```
Likewise, a Transformer can be trained as follows:
```
fairseq-train data-bin/iwslt14.tokenized.de-en \
    --arch transformer_iwslt_de_en --share-decoder-input-output-embed \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --dropout 0.3 --weight-decay 0.0001 --max-tokens 4000 \
    --criterion jensen_cross_entropy --alpha 0.5 \
    --beta 0.7 --use-uniform
```
Generation is the same as in the fairseq Translation README.
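For reference, a typical fairseq generation invocation looks like the following; the checkpoint path and decoding flags here are illustrative defaults, so consult the fairseq Translation README for the exact command:

```shell
fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --batch-size 128 --beam 5 --remove-bpe
```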
### Abstractive Summarization
Download the CNN/DailyMail data set according to the BART README. Follow their suggested training method, setting `--criterion` to `jensen_cross_entropy` and specifying `--alpha`, `--beta`, and (if desired) `--use-uniform`.

Other models can be trained using the same methodology as above. See the fairseq documentation for more options.
## Requirements and Installation
- PyTorch version >= 1.2.0
- Python version >= 3.6
- For training new models, you'll also need an NVIDIA GPU and NCCL
- For faster training, install NVIDIA's apex library with the `--cuda_ext` and `--deprecated_fused_adam` options
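An apex install with those extension flags typically looks like the following sketch; the `--global-option` pattern follows the fairseq README of that era, but check NVIDIA's apex repository for the current recommended procedure:

```shell
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir \
    --global-option="--cpp_ext" --global-option="--cuda_ext" \
    --global-option="--deprecated_fused_adam" ./
```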
Installation:

```
git clone https://github.com/rycolab/entropyRegularization
cd entropyRegularization
pip install --editable .
```
## Citation
This code accompanies the paper Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing, published at ACL 2020. Please cite as:
```bibtex
@inproceedings{meister+al.acl20,
    title = {Generalized Entropy Regularization or: {T}here's Nothing Special about Label Smoothing},
    author = {Meister, Clara and
              Salesky, Elizabeth and
              Cotterell, Ryan},
    booktitle = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
    month = {July},
    year = {2020},
    address = {Seattle, USA},
    publisher = {Association for Computational Linguistics},
}
```
Feel free to include the fairseq citation as well; they're awesome.
```bibtex
@inproceedings{ott-etal-2019-fairseq,
    title = "fairseq: {A} Fast, Extensible Toolkit for Sequence Modeling",
    author = "Ott, Myle and
              Edunov, Sergey and
              Baevski, Alexei and
              Fan, Angela and
              Gross, Sam and
              Ng, Nathan and
              Grangier, David and
              Auli, Michael",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics (Demonstrations)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-4009",
    doi = "10.18653/v1/N19-4009",
    pages = "48--53",
}
```