Simulst

PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.

Simultaneous Speech Translation

Code base for simultaneous speech translation experiments. It is based on fairseq.

Implemented

Encoder

Streaming Models

Setup

  1. Install fairseq
     git clone https://github.com/pytorch/fairseq.git
     cd fairseq
     git checkout 4a7835b
     python setup.py build_ext --inplace
     pip install .
  2. (Optional) Install apex for faster mixed precision (fp16) training.
  3. Install dependencies
     pip install -r requirements.txt
  4. Update submodules
     git submodule update --init --recursive

Pre-trained model

ASR model with Emformer encoder and Transformer decoder. Pre-trained with joint CTC cross-entropy loss.

|MuST-C (WER)|en-de (V2)|en-es|
|-|-|-|
|dev|9.65|14.44|
|tst-COMMON|12.85|14.02|
|model|download|download|
|vocab|download|download|
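
To illustrate what a joint CTC and cross-entropy objective computes, here is a minimal plain-Python sketch. It is not the training criterion used in this repository. The function names, the `ctc_weight=0.3` interpolation weight, and the per-token averaging of the cross-entropy term are all assumptions made for illustration.

```python
import math

NEG_INF = -math.inf


def log_add(*values):
    """Numerically stable log(sum(exp(v))) over the given log-values."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))


def ctc_nll(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (forward algorithm).

    log_probs: one row of vocabulary log-probabilities per encoder frame.
    target: list of label ids, none of them equal to `blank`.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    alpha = [NEG_INF] * len(ext)
    alpha[0] = log_probs[0][ext[0]]
    if len(ext) > 1:
        alpha[1] = log_probs[0][ext[1]]
    for frame in log_probs[1:]:
        prev, alpha = alpha, []
        for s, label in enumerate(ext):
            paths = [prev[s]]
            if s >= 1:
                paths.append(prev[s - 1])
            # Skipping over a blank is allowed only between different labels.
            if s >= 2 and label != blank and label != ext[s - 2]:
                paths.append(prev[s - 2])
            alpha.append(log_add(*paths) + frame[label])
    # A valid alignment ends on the last label or on the trailing blank.
    return -log_add(*alpha[-2:])


def cross_entropy(log_probs, target):
    """Mean per-token negative log-likelihood of the decoder outputs."""
    return -sum(row[y] for row, y in zip(log_probs, target)) / len(target)


def joint_loss(ctc_log_probs, dec_log_probs, target, ctc_weight=0.3):
    """Interpolate the two losses; ctc_weight is a hypothetical default."""
    ctc = ctc_nll(ctc_log_probs, target)
    ce = cross_entropy(dec_log_probs, target)
    return ctc_weight * ctc + (1.0 - ctc_weight) * ce
```

In this sketch the CTC term is computed from encoder frame outputs and the cross-entropy term from decoder token outputs, so the two inputs may have different lengths. In practice a framework implementation such as PyTorch's built-in CTC loss would replace the hand-written forward pass.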

Sequence-level Knowledge Distillation

|MuST-C (BLEU)|en-de (V2)|
|-|-|
|valid|31.76|
|distillation|download|
|vocab|download|

Citation

Please consider citing our paper:

@inproceedings{chang22f_interspeech,
  author={Chih-Chiang Chang and Hung-yi Lee},
  title={{Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={5175--5179},
  doi={10.21437/Interspeech.2022-10627}
}