ChocoTTS

An AI-driven WebSocket interpreter for Dalamud's TextToTalk plugin that uses Coqui Ai TTS for lifelike audio and j-hartmann's emotion transformer model to infer emotion from text.

<img src="https://github.com/J3sven/ChocoTTS/blob/main/chocotts/assets/chocotts.png?raw=true" height="56"/> ChocoTTS

ChocoTTS is a WebSocket-based interpreter for the TextToTalk plugin in Dalamud, enabling lifelike text-to-speech (TTS) and emotion inference from text. It uses the 🐸Coqui Ai TTS model for generating speech and j-hartmann's emotion transformer model for detecting emotions in text.

Features

Real-time TTS generation using Coqui Ai models, all generated locally
Emotion inference using j-hartmann's emotion transformer model
Caching of generated speech for faster repeat access
Adjustable audio playback volume
Support for multiple NPCs with different voice samples
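
The caching feature listed above can be illustrated with a minimal sketch. This is not ChocoTTS's actual implementation: the names `cache_key` and `get_or_synthesize`, the `.wav` file naming, and the `synthesize` callback (which stands in for the real TTS call) are all hypothetical. The idea is to derive a stable file name from the voice and the text, and to generate audio only when no file exists for that pair yet.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(text: str, voice: str) -> str:
    """Derive a stable file name from the voice and the spoken text."""
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return f"{digest}.wav"


def get_or_synthesize(
    text: str,
    voice: str,
    cache_dir: Path,
    synthesize: Callable[[str, str], bytes],
) -> Path:
    """Return the cached audio file, synthesizing it only on a cache miss.

    `synthesize` is a placeholder for whatever produces the audio bytes;
    it is an assumption made for this sketch, not a ChocoTTS function.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / cache_key(text, voice)
    if not path.exists():
        path.write_bytes(synthesize(text, voice))
    return path
```

Including the voice in the key means the same line spoken by two different NPCs produces two separate cache entries, while a repeated line from the same NPC is served from disk.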

Installation

The application is still under development. Once a stable version 1.0 is ready, an installer will be published.

Prerequisites

XIVLauncher (for Dalamud)
TextToTalk (Dalamud plugin that provides the WebSocket server from which text is read)
Python 3.10 or higher
ffmpeg (for audio processing)
An NVIDIA GPU is highly recommended
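
Two of the prerequisites above can be checked from a script before setup. The following is a small sketch using only the Python standard library; the function name `check_prerequisites` is hypothetical and not part of ChocoTTS. It checks only the Python version and whether an `ffmpeg` executable is on the `PATH`, and does not verify XIVLauncher, TextToTalk, or the GPU.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Report whether the Python version and ffmpeg requirements are met."""
    return {
        # Compare (major, minor) of the running interpreter to the minimum.
        "python": sys.version_info[:2] >= tuple(min_python),
        # shutil.which returns None when no ffmpeg executable is on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```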

License

This project is licensed under the GNU General Public License. See the LICENSE file for more details.
