8 skills found
microsoft / SpeechT5: Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
slp-rl / Slamkit: SlamKit is an open-source toolkit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day".
dreamtheater123 / Awesome SpeechLM Survey: GitHub repository for the ACL 2025 paper "Recent Advances in Speech Language Models: A Survey".
elyxlz / Voxtral: Converts Mistral into an end-to-end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GPT-4o (closed) or Moshi (complex), it's open, simple, and natural.
soumimaiti / Speechlmscore Tool: No description available
Meetween / Speechlmm: Multimodal and multilingual foundation models supporting audio, video, and text.
SirryChen / SpeechMedAssist: The first medical SpeechLM, open-sourced with data and code for training, inference, and evaluation.
msh9184 / NeMo Speechlm2: No description available