Voicecloner
Voice cloning desktop app using Qwen3-TTS - Rust/Iced frontend with Python/FastAPI backend
VoiceCloner
A desktop application for voice design and voice cloning powered by Qwen3-TTS.
Features
- Voice Design - Create custom voices from natural language descriptions
- Voice Cloning - Clone any voice from just 3 seconds of audio
- Built-in Recording - Record your voice directly in the app
- 10 Languages - English, Chinese, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian
- Local Processing - All AI processing happens on your machine
Quick Start
Prerequisites
- macOS 11+, Windows 10+, or Linux
- Python 3.11+
- Rust (for building from source)
- NVIDIA GPU with 8GB+ VRAM (recommended; 16GB+ for the 1.7B models) or CPU (slower)
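Before running the setup script, it can help to confirm the Python and Rust prerequisites are in place. The following is a minimal sketch, not part of the repository: the function name `missing_prerequisites` is hypothetical, and it assumes the Rust toolchain is detected by finding `cargo` on the PATH.

```python
import shutil
import sys
from typing import Callable, Optional, Sequence


def missing_prerequisites(
    python_version: Sequence[int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper: checks only the two prerequisites that can be
    verified from Python (interpreter version and the presence of cargo).
    """
    problems = []
    # The README asks for Python 3.11 or newer.
    if tuple(python_version[:2]) < (3, 11):
        problems.append("Python 3.11+ is required")
    # Assumption: a working Rust install exposes `cargo` on the PATH.
    if which("cargo") is None:
        problems.append("Rust toolchain (cargo) not found on PATH")
    return problems


# Example: print any problems found on the current machine.
for problem in missing_prerequisites():
    print(problem)
```

The version and the lookup function are parameters so the check can be exercised without changing the real environment.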
Development Setup
# Clone the repository
git clone https://github.com/adibhanna/voicecloner
cd voicecloner
# Run setup script (installs Python deps, builds Rust)
./scripts/setup.sh
# Run the app
cargo run
Build Release App (macOS)
# Build the app bundle
./scripts/build-macos.sh
# The app will be at: target/bundle/VoiceCloner.app
open target/bundle/VoiceCloner.app
# Optionally create a DMG for distribution
./scripts/create-dmg.sh
How It Works
When you launch VoiceCloner:
- The app automatically starts a local Python backend server
- The backend loads Qwen3-TTS models (auto-downloaded on first use)
- You can design voices, clone voices, or generate speech
- All processing happens locally on your machine
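The first step above (the app starting a local backend and waiting for it) can be sketched as follows. In VoiceCloner this logic lives in the Rust process manager under src/backend/; the Python version below only illustrates the idea. The names `wait_for_port` and `start_backend`, the port 8000, and the launch command in the usage comment are all assumptions, not taken from the project.

```python
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or connection refused): retry shortly.
            time.sleep(interval)
    return False


def start_backend(command: list[str], host: str, port: int, timeout: float = 30.0) -> subprocess.Popen:
    """Spawn the backend process and block until its port accepts connections."""
    process = subprocess.Popen(command)
    if not wait_for_port(host, port, timeout):
        # Clean up the child so a failed start does not leave it running.
        process.terminate()
        raise RuntimeError(f"backend did not become ready on {host}:{port}")
    return process


# Hypothetical usage (command and port are assumptions):
#   import sys
#   backend = start_backend([sys.executable, "backend/main.py"], "127.0.0.1", 8000)
#   ...
#   backend.terminate()
```

Polling the port rather than sleeping for a fixed time matters here because the first launch may take much longer while the models download.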
System Requirements
Minimum (CPU mode)
- 16GB RAM (32GB recommended for 1.7B models)
- 15GB free disk space
- Microphone for voice cloning
Recommended (GPU mode)
- NVIDIA GPU with 16GB+ VRAM (for 1.7B models)
- NVIDIA GPU with 8GB+ VRAM (for 0.6B models)
- 32GB RAM
- 20GB free disk space
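The requirements above amount to a small decision table, sketched below. This is an illustration only: the function name `recommend_config` is hypothetical, the README does not say whether the app picks a model automatically, and mapping 16GB of RAM in CPU mode to the 0.6B models is an assumption inferred from the "32GB recommended for 1.7B models" note.

```python
from typing import Optional


def recommend_config(vram_gb: Optional[float], ram_gb: float) -> tuple[str, str]:
    """Return (device, model_size) for the given hardware.

    vram_gb is None when no NVIDIA GPU is available.
    Thresholds follow the System Requirements section.
    """
    if vram_gb is not None and vram_gb >= 16:
        return ("gpu", "1.7B")  # 16GB+ VRAM: 1.7B models
    if vram_gb is not None and vram_gb >= 8:
        return ("gpu", "0.6B")  # 8GB+ VRAM: 0.6B models
    if ram_gb >= 32:
        return ("cpu", "1.7B")  # CPU mode, 32GB RAM recommended for 1.7B
    if ram_gb >= 16:
        return ("cpu", "0.6B")  # Assumption: 16GB minimum suits the smaller models
    raise ValueError("below the documented minimum of 16GB RAM")


print(recommend_config(vram_gb=8, ram_gb=32))
```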
Project Structure
voicecloner/
├── src/                  # Rust frontend (iced GUI)
│   ├── main.rs
│   ├── app.rs            # Main application
│   ├── ui/               # UI panels
│   ├── audio/            # Recording & playback
│   ├── backend/          # Backend client & process manager
│   └── state/            # App state & persistence
├── backend/              # Python backend (FastAPI + Qwen3-TTS)
│   ├── main.py           # API server
│   ├── tts_engine.py     # TTS model wrapper
│   └── requirements.txt
└── scripts/              # Build scripts
    ├── setup.sh
    ├── build-macos.sh
    └── create-dmg.sh
Acknowledgments
This project is powered by Qwen3-TTS from the Qwen Team. Qwen3-TTS provides:
- High-quality text-to-speech synthesis
- Voice cloning from short audio samples
- Voice design from natural language descriptions
- Support for 10+ languages
License
MIT
