SparkTTS.cpp
SparkTTS inference with C++
Windows:
- ONNX Runtime (DirectML backend) for BiCodec/Wav2Vec etc.
- llama.cpp (Vulkan backend) for Qwen2.5-0.5B
macOS:
- CoreML for BiCodec/Wav2Vec etc.
- llama.cpp (Metal backend) for Qwen2.5-0.5B
Performance
With a Q4-K quantized transformer, it achieves a Real-Time Factor (RTF) of approximately 0.15 and a first-audio-sample latency of 300 ms on an NVIDIA RTX 4070 GPU.
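For reference, RTF is conventionally the ratio of synthesis time to the duration of the audio produced, so values below 1 mean the audio is generated faster than it plays back. A minimal sketch of that arithmetic, assuming the conventional definition; the function names are illustrative and are not part of this project's API:

```cpp
#include <cassert>
#include <cstddef>

// Duration in seconds of `num_samples` mono samples at `sample_rate_hz`.
inline double audio_seconds(std::size_t num_samples, int sample_rate_hz) {
    assert(sample_rate_hz > 0);
    return static_cast<double>(num_samples) / static_cast<double>(sample_rate_hz);
}

// Real-Time Factor: synthesis time divided by audio duration.
// RTF < 1.0 means audio is produced faster than real time.
inline double real_time_factor(double synthesis_seconds, double audio_duration_seconds) {
    assert(audio_duration_seconds > 0.0);
    return synthesis_seconds / audio_duration_seconds;
}

// Synthesis time implied by a given RTF for a clip of the given length.
inline double synthesis_seconds_for(double rtf, double audio_duration_seconds) {
    return rtf * audio_duration_seconds;
}
```

Under this definition, an RTF of 0.15 means a 10-second clip takes roughly 1.5 seconds to synthesize, which is what `synthesis_seconds_for(0.15, 10.0)` computes.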
How to build
Install Rust
Set up vcpkg
Build llama.cpp
Windows
- Make sure you are using the x64 Native Tools Command Prompt for VS 2022
- Set up Vulkan dependencies (see the llama.cpp build doc)
- Build and install with CMake
cd third_party\llama.cpp
cmake -B build -G Ninja -DGGML_VULKAN=ON -DLLAMA_CURL=OFF -DCMAKE_INSTALL_PREFIX=..\..\lib\llama -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
cmake --install build --config Release
cd ..\..
macOS (Apple Silicon)
pushd third_party/llama.cpp
cmake -B build -G Ninja -DLLAMA_CURL=OFF -DCMAKE_INSTALL_PREFIX=../../lib/llama -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
cmake --install build --config Release
popd
Build ONNX Runtime (Windows only with DirectML)
cd third_party\onnxruntime
python .\tools\ci_build\build.py ^
--update ^
--build ^
--config Release ^
--build_shared_lib ^
--parallel ^
--build_dir ./build ^
--cmake_extra_defines "CMAKE_POLICY_VERSION_MINIMUM=3.5" ^
--skip_tests ^
--enable_lto ^
--use_dml
cmake --install build\Release --config Release --prefix ..\..\lib\onnxruntime
cd ..\..
Build with CMake and Ninja
Windows
cmake --preset=vcpkg -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
cmake --install build --config Release && copy /Y build\src\*.dll install\tools\bin
macOS
cmake --preset=vcpkg -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
cmake --install build --config Release
How to use
A C API is provided for use from C++ and other languages.
An example command-line tool is provided for performance tuning.
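The two figures quoted under Performance (first-audio latency and RTF) can be measured around any streaming synthesis call. A minimal timing sketch, assuming a hypothetical callback-style interface: `ChunkSink`, `StreamingSynth`, `SynthesisTiming` and `time_synthesis` are illustrative names and are not this project's C API or its bundled tool.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Timing results for one synthesis run.
struct SynthesisTiming {
    double first_audio_seconds = 0.0;  // time until the first non-empty chunk arrived
    double total_seconds = 0.0;        // time until the synthesis call returned
    std::size_t total_samples = 0;     // samples delivered across all chunks
};

// Hypothetical interface: the synthesizer pushes audio chunks into a sink.
using ChunkSink = std::function<void(const float* samples, std::size_t count)>;
using StreamingSynth = std::function<void(const ChunkSink& sink)>;

// Runs `synthesize` once and records first-chunk latency, total time,
// and the number of samples produced.
inline SynthesisTiming time_synthesis(const StreamingSynth& synthesize) {
    assert(static_cast<bool>(synthesize));
    using Clock = std::chrono::steady_clock;

    SynthesisTiming timing;
    bool seen_first = false;
    const auto start = Clock::now();

    synthesize([&](const float* /*samples*/, std::size_t count) {
        if (!seen_first && count > 0) {
            timing.first_audio_seconds =
                std::chrono::duration<double>(Clock::now() - start).count();
            seen_first = true;
        }
        timing.total_samples += count;
    });

    timing.total_seconds =
        std::chrono::duration<double>(Clock::now() - start).count();
    return timing;
}
```

Dividing `total_seconds` by the audio duration (`total_samples` over the output sample rate) gives the RTF for that run. `std::chrono::steady_clock` is used because it is monotonic and unaffected by system clock adjustments.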
Acknowledgements
Models used in this project are from SparkAudio/Spark-TTS
Inspired by arghyasur1991/Spark-TTS-Unity
Third-party libraries used in this project:
