Abogen
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
abogen <img width="40px" title="abogen icon" src="https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/abogen/assets/icon.ico" align="right" style="padding-left: 10px; padding-top:5px;">
<a href="https://trendshift.io/repositories/14433" target="_blank"><img src="https://trendshift.io/api/badge/repositories/14433" alt="denizsafak%2Fabogen | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
Abogen is a powerful text-to-speech conversion tool that uses Kokoro-82M to turn ePub, PDF, text, markdown, or subtitle files into high-quality audio with matching subtitles in seconds. Use it for audiobooks, voiceovers for Instagram, YouTube, TikTok, or any project that needs natural-sounding text-to-speech.
<img title="Abogen Main" src='https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/demo/abogen.png' width="380"> <img title="Abogen Processing" src='https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/demo/abogen2.png' width="380">
Demo
https://github.com/user-attachments/assets/094ba3df-7d66-494a-bc31-0e4b41d0b865
This demo was generated in just 5 seconds, producing ∼1 minute of audio with perfectly synced subtitles. To create a similar video, see the demo guide.
How to install? <a href="https://pypi.org/project/abogen/" target="_blank"><img src="https://img.shields.io/pypi/pyversions/abogen" alt="Abogen Compatible PyPi Python Versions" align="right" style="margin-top:6px;"></a>
Windows
Go to the espeak-ng latest release, then download and run the *.msi file.
<b>OPTION 1: Install using script</b>
- Download the repository
- Extract the ZIP file
- Run WINDOWS_INSTALL.bat by double-clicking it
This method handles everything automatically - installing all dependencies including CUDA in a self-contained environment without requiring a separate Python installation. (You still need to install espeak-ng.)
[!NOTE] You don't need to install Python separately. The script will install Python automatically.
<b>OPTION 2: Install using uv</b>
First, install uv if you haven't already.
# For NVIDIA GPUs (CUDA 12.8) - Recommended
uv tool install --python 3.12 abogen[cuda] --extra-index-url https://download.pytorch.org/whl/cu128 --index-strategy unsafe-best-match
# For NVIDIA GPUs (CUDA 12.6) - Older drivers
uv tool install --python 3.12 abogen[cuda126] --extra-index-url https://download.pytorch.org/whl/cu126 --index-strategy unsafe-best-match
# For NVIDIA GPUs (CUDA 13.0) - Newer drivers
uv tool install --python 3.12 abogen[cuda130] --extra-index-url https://download.pytorch.org/whl/cu130 --index-strategy unsafe-best-match
# For AMD GPUs or without GPU - If you have AMD GPU, you need to use Linux for GPU acceleration, because ROCm is not available on Windows.
uv tool install --python 3.12 abogen
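The four Windows variants above differ only in the package extra and the PyTorch wheel index. As a rough sketch of that mapping, the helper below rebuilds each command from a small table. The names `VARIANTS` and `uv_install_args` and the variant keys are hypothetical and exist only for this illustration; the extras, URLs, and flags are copied from the commands above.

```python
# Extras and PyTorch wheel indexes copied from the Windows uv commands above.
# The dictionary keys are labels made up for this sketch, not abogen options.
VARIANTS = {
    "cuda128": ("abogen[cuda]", "https://download.pytorch.org/whl/cu128"),
    "cuda126": ("abogen[cuda126]", "https://download.pytorch.org/whl/cu126"),
    "cuda130": ("abogen[cuda130]", "https://download.pytorch.org/whl/cu130"),
    "cpu": ("abogen", None),  # AMD GPUs on Windows and machines without a GPU
}


def uv_install_args(variant: str, python: str = "3.12") -> list[str]:
    """Return the argument list for `uv tool install` for the given variant."""
    package, index_url = VARIANTS[variant]
    args = ["uv", "tool", "install", "--python", python, package]
    if index_url is not None:
        # GPU builds pull PyTorch wheels from an extra index.
        args += ["--extra-index-url", index_url,
                 "--index-strategy", "unsafe-best-match"]
    return args


if __name__ == "__main__":
    # Print the command for older NVIDIA drivers so it can be copied to a shell.
    print(" ".join(uv_install_args("cuda126")))
```

The function only builds the command; it does not run it, so it is safe to try before deciding which variant matches your driver.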
<details>
<summary><b>Alternative: Install using pip (click to expand)</b></summary>
# Create a virtual environment (optional)
mkdir abogen && cd abogen
python -m venv venv
venv\Scripts\activate
# For NVIDIA GPUs:
# We need to use an older version of PyTorch (2.8.0) until this issue is fixed: https://github.com/pytorch/pytorch/issues/166628
pip install torch==2.8.0+cu128 torchvision==0.23.0+cu128 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
# For AMD GPUs:
# Not supported yet, because ROCm is not available on Windows. Use Linux if you have AMD GPU.
# Install abogen
pip install abogen
</details>
Mac
First, install uv if you haven't already.
# Install espeak-ng
brew install espeak-ng
# For Apple Silicon Mac (M1, M2 etc.)
uv tool install --python 3.13 abogen --with "kokoro @ git+https://github.com/hexgrad/kokoro.git,numpy<2"
# For Intel Mac
uv tool install --python 3.12 abogen --with "kokoro @ git+https://github.com/hexgrad/kokoro.git,numpy<2"
<details>
<summary><b>Alternative: Install using pip (click to expand)</b></summary>
# Install espeak-ng
brew install espeak-ng
# Create a virtual environment (recommended)
mkdir abogen && cd abogen
python3 -m venv venv
source venv/bin/activate
# Install abogen
pip3 install abogen
# For Apple Silicon Mac (M1, M2 etc.)
# After installing abogen, we need to install Kokoro's development version which includes MPS support.
pip3 install git+https://github.com/hexgrad/kokoro.git
</details>
Linux
First, install uv if you haven't already.
# Install espeak-ng
sudo apt install espeak-ng # Ubuntu/Debian
sudo pacman -S espeak-ng # Arch Linux
sudo dnf install espeak-ng # Fedora
# For NVIDIA GPUs or without GPU - No need to include [cuda] in here.
uv tool install --python 3.12 abogen
# For AMD GPUs (ROCm 6.4)
uv tool install --python 3.12 abogen[rocm] --extra-index-url https://download.pytorch.org/whl/nightly/rocm6.4 --index-strategy unsafe-best-match
<details>
<summary><b>Alternative: Install using pip (click to expand)</b></summary>
# Install espeak-ng
sudo apt install espeak-ng # Ubuntu/Debian
sudo pacman -S espeak-ng # Arch Linux
sudo dnf install espeak-ng # Fedora
# Create a virtual environment (recommended)
mkdir abogen && cd abogen
python3 -m venv venv
source venv/bin/activate
# Install abogen
pip3 install abogen
# For NVIDIA GPUs:
# Already supported, no need to install CUDA separately.
# For AMD GPUs:
# After installing abogen, we need to uninstall the existing torch package
pip3 uninstall torch
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.4
</details>
See How to fix "CUDA GPU is not available. Using CPU" warning?
See How to fix "[WinError 1114] A dynamic link library (DLL) initialization routine failed" error?
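Before digging into either warning, it can help to confirm what the installation actually sees. The script below is a minimal sketch, not part of Abogen: `find_tools` and `gpu_backend` are hypothetical helper names. It checks whether `espeak-ng`, `abogen`, and `abogen-web` resolve on PATH and which PyTorch backend is available. It assumes you run it with the Python interpreter of the environment Abogen was installed into; with `uv tool install`, PyTorch lives in an isolated environment and will not be importable from the system Python.

```python
import importlib.util
import shutil


def find_tools(names, which=shutil.which):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: which(name) for name in names}


def gpu_backend():
    """Return "cuda", "mps", "cpu", or None when PyTorch is not importable."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    if torch.cuda.is_available():  # also True on ROCm builds of PyTorch
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # Apple Silicon backend
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    for name, path in find_tools(["espeak-ng", "abogen", "abogen-web"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
    backend = gpu_backend()
    print(f"PyTorch backend: {backend or 'torch not importable from this interpreter'}")
```

A result of `cpu` on a machine with an NVIDIA GPU suggests a CPU-only PyTorch build was installed. A tool reported as missing may still be installed somewhere outside PATH, so treat that line as a hint rather than proof.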
Special thanks to @hg000125 for his contribution in #23. AMD GPU support is possible thanks to his work.
Interfaces
Abogen offers two interfaces, but currently they have different feature sets. The Web UI contains newer features that are still being integrated into the desktop application.
| Command | Interface | Features |
|---------|-----------|----------|
| abogen | PyQt6 Desktop GUI | Stable core features |
| abogen-web | Flask Web UI | Core features + Supertonic TTS, LLM Normalization, Audiobookshelf Integration and more! |
Note: The Web UI is under active development. We are working to integrate these new features into the PyQt desktop app. Until then, the Web UI provides the most feature-rich experience.
Special thanks to @jeremiahsb for making this possible! I was honestly surprised by his massive contribution (>55,000 lines!) that brought the entire Web UI to life.
🖥️ Desktop Application (PyQt)
How to run?
You can simply run this command to start Abogen Desktop GUI:
abogen
[!TIP] If you installed Abogen using the Windows installer (WINDOWS_INSTALL.bat), it should have created a shortcut in the same folder or on your desktop. You can run it from there. If you lost the shortcut, Abogen is located in python_embedded/Scripts/abogen.exe. You can run it from there directly.
How to use?
- Drag and drop any ePub, PDF, text, markdown, or subtitle file (or use the built-in text editor)
- Configure the settings:
- Set speech speed
- Select a voice (or create a custom voice using voice mixer)
- Select subtitle generation style (by sentence, word, etc.)
- Select output format
- Select where to save the output
- Hit Start
In action
<img title="Abogen in action" src='https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/demo/abogen.gif'>
Here’s Abogen in action: in this demo, it processes ∼3,000 characters of text in just 11 seconds and turns it into 3 minutes and 28 seconds of audio on a low-end RTX 2060 Mobile laptop GPU. Your results may vary depending on your hardware.
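To put those demo numbers in perspective, the arithmetic below works out the speed relative to real time and the character throughput. It is only a sketch: the function names are made up for illustration, the 500,000-character book is a hypothetical input, and the extrapolation assumes throughput scales linearly with text length, which the demo does not establish.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are produced per second of processing."""
    return audio_seconds / processing_seconds


def estimate_processing_seconds(num_chars: int, chars_per_second: float) -> float:
    """Naive linear estimate of processing time for a text of num_chars."""
    return num_chars / chars_per_second


# Figures quoted for the demo (the character count is approximate).
AUDIO_SECONDS = 3 * 60 + 28   # 3 minutes 28 seconds = 208 s
PROCESSING_SECONDS = 11
DEMO_CHARS = 3_000

rtf = realtime_factor(AUDIO_SECONDS, PROCESSING_SECONDS)
cps = DEMO_CHARS / PROCESSING_SECONDS

print(f"{rtf:.1f}x faster than real time")   # 18.9x
print(f"{cps:.0f} characters per second")    # 273

# Hypothetical book length, used only to show the extrapolation.
book_chars = 500_000
minutes = estimate_processing_seconds(book_chars, cps) / 60
print(f"about {minutes:.1f} minutes for {book_chars:,} characters")  # 30.6
```

These figures apply to the RTX 2060 Mobile used in the demo; CPU-only runs or other GPUs will give different ratios.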
Configuration
| Options | Description |
|---------|-------------|
| Input Box | Drag and drop ePub, PDF, .TXT, .MD, .SRT, .ASS or .VTT files (or use built-in text editor) |
| Queue options | Add multiple files to a queue and process them in batch, with individual settings for each file. S
