AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

We provide our implementation and pretrained models as open source in this repository.

Get Started

Please refer to run.md.

Capabilities

The tables below list the current capabilities of AudioGPT. More supported models and tasks are coming soon. For prompt examples, refer to asset.

Currently, not every model has a repository.

Speech

| Task | Supported Foundation Models | Status |
|:--------------------------:|:-------------------------------:|:------:|
| Text-to-Speech | FastSpeech, SyntaSpeech, VITS | Yes (WIP) |
| Style Transfer | GenerSpeech | Yes |
| Speech Recognition | whisper, Conformer | Yes |
| Speech Enhancement | ConvTasNet | Yes (WIP) |
| Speech Separation | TF-GridNet | Yes (WIP) |
| Speech Translation | Multi-decoder | WIP |
| Mono-to-Binaural | NeuralWarp | Yes |

Sing

| Task | Supported Foundation Models | Status |
|:-------------------------:|:-------------------------------:|:------:|
| Text-to-Sing | DiffSinger, VISinger | Yes (WIP) |

Audio

| Task | Supported Foundation Models | Status |
|:----------------------:|:---------------------------:|:------:|
| Text-to-Audio | Make-An-Audio | Yes |
| Audio Inpainting | Make-An-Audio | Yes |
| Image-to-Audio | Make-An-Audio | Yes |
| Sound Detection | Audio-transformer | Yes |
| Target Sound Detection | TSDNet | Yes |
| Sound Extraction | LASSNet | Yes |

Talking Head

| Task | Supported Foundation Models | Status |
|:-------------------------:|:-------------------------------:|:----------:|
| Talking Head Synthesis | GeneFace | Yes (WIP) |

Acknowledgement

We are grateful to the following open-source projects:

ESPNet, NATSpeech, Visual ChatGPT, Hugging Face, LangChain, Stable Diffusion
