NeuralTextToAudio
Text prompt steered synthetic audio generators
Colab notebooks for text-to-audio generators
❗️ This repository has not been actively maintained since 2023, as closed-source state-of-the-art text-to-audio solutions are now widely available to everyone.
User-friendly Colab notebooks for various text prompt steered synthetic audio generators.
Available notebooks:
- AudioLDM – text-to-audio
- TorToiSe TTS – text-to-speech w/ voice-cloning
- MubertAI Text-to-Music – text-to-music
- TTS Voice Cloning – text-to-speech w/ voice-cloning
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Paper: Text-to-Audio Generation with Latent Diffusion Models
Colab for AudioLDM. Generates audio from a text description. This is probably the beginning of a "Stable Diffusion of audio". Currently capable of producing 16 kHz audio only.
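To make the 16 kHz limit concrete, here is a minimal sketch of saving a mono waveform at that sample rate using only the Python standard library. It is not taken from the notebook: the helper names (`write_wav_16k`, `sine_placeholder`, `wav_info`) are hypothetical, and the sine tone is only a stand-in for whatever samples a generator would return.

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # Hz; the output rate this README states for AudioLDM


def write_wav_16k(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


def sine_placeholder(duration_s, freq_hz=440.0, sample_rate=SAMPLE_RATE):
    """Hypothetical stand-in for generated audio: a plain sine tone."""
    n = int(duration_s * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def wav_info(path):
    """Return (sample_rate, frame_count) read back from a WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnframes()
```

For example, `write_wav_16k("out.wav", sine_placeholder(2.0))` writes two seconds of audio, which at 16 kHz is 32,000 samples. A 16 kHz sample rate can only represent frequencies up to 8 kHz, which is why such output sounds duller than 44.1 kHz or 48 kHz material.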
TorToiSe: Text-to-speech
Paper: TorToiSe - Spending Compute for High Quality TTS
Colab for TorToiSe text-to-speech voice-cloning. This notebook takes a text string and an audio file (or files) of a speaker's voice, and attempts to synthesize the text using the given voice. Currently works with English text only.
MubertAI Text-to-Music
UPDATE: the Mubert API now appears to require a (paid) API key.
Colab for MubertAI Text-to-Music. Generates music from a text description using predefined blocks that, as far as I know, are created by the community. See the source repository for further information, such as licensing.
TTS Voice Cloning
Paper: Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Colab for Real-Time-Voice-Cloning text-to-speech voice-cloning. This notebook takes a text string and an audio file of a speaker's voice, and attempts to synthesize the text using the given voice. Fair warning: results are not great.
