Results for "speech-coding"

653 skills found · Page 1 of 22

SWivid / F5 TTS

14.3k

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

universal

Updated 3h ago

Rudrabha / Wav2Lip

12.9k

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.

universal

Updated 1h ago

xorbitsai / Inference

9.2k

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

universal

artificial-intelligence, chatglm, deployment, +17

Updated 22m ago

Azure-Samples / Cognitive Services Speech SDK

3.4k

Sample code for the Microsoft Cognitive Services Speech SDK

universal

Updated 1d ago

openai / Openai Fm

2.8k

Code for openai.fm, a demo for the OpenAI Speech API

universal

Updated 17h ago

Robitx / Gp.nvim

1.3k

Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]

claude code, claude desktop, +2

claude, codeium, copilot, +17

Updated 1d ago

ddlBoJack / Emotion2vec

1.1k

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

universal

iemocap, pytorch-implementation, speech-emotion-recognition, +1

Updated 15h ago

Azure-Samples / Cognitive Speech TTS

1.0k

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.

universal

azure-tts, custom-neural-voice, e2etts, +11

Updated 7d ago

yiranran / Audio Driven TalkingFace HeadPose

776

Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (arXiv 2020) and "Predicting Personalized Head Movement From Short Video and Speech Signal" (TMM 2022)

zed

Updated 13d ago

Rudrabha / Lip2Wav

712

Code for the CVPR 2020 paper "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"

universal

Updated 2d ago

DmitryRyumin / INTERSPEECH 2023 24 Papers

686

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

universal

acoustic, adaptation, asr, +17

Updated 9d ago

ZhangXInFD / SpeechTokenizer

650

Code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on

universal

Updated 8d ago

Dadangdut33 / Speech Translate

642

A real-time speech transcription and translation application using OpenAI Whisper and free translation APIs. Interface built with Tkinter. Written entirely in Python.

universal

python, speech-transcription, speech-translation, +3

Updated 12h ago

DmitryRyumin / ICASSP 2023 24 Papers

523

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

universal

asr, denoising, domain-adaptation, +17

Updated 8d ago

gemengtju / Tutorial Separation

476

This repo summarizes tutorials, datasets, papers, code, and tools for speech separation and speaker extraction tasks. Pull requests are welcome.

universal

deep-learning, deep-neural-networks, signal-processing, +3

Updated 4d ago

YuanGongND / Ltu

473

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

universal

audio, audio-processing, deep-learning, +2

Updated 4d ago

facebookresearch / Brainmagick

462

Training and evaluation pipeline for MEG and EEG brain signal encoding and decoding using deep learning. Code for our paper "Decoding speech perception from non-invasive brain recordings" published in Nature Machine Intelligence, 2023.

universal

Updated 11d ago

FireRedTeam / FireRedASR2S

433

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singing ASR. FireRedVAD supports speech/singing/music in 100+ langs. FireRedLID supports 100+ langs and 20+ zh dialects. FireRedPunc supports zh and en.

universal

asr, asr-pipeline, audio-event-classification, +15

Updated 5h ago

YuanGongND / Whisper At

412

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

universal

audio, audio-classification, audio-processing, +2

Updated 10d ago

facebookresearch / Meshtalk

401

Code for MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement

universal

Updated 7d ago