
Abogen

Generate audiobooks from EPUBs, PDFs and text with synchronized captions.


abogen <img width="40px" title="abogen icon" src="https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/abogen/assets/icon.ico" align="right" style="padding-left: 10px; padding-top:5px;">


<a href="https://trendshift.io/repositories/14433" target="_blank"><img src="https://trendshift.io/api/badge/repositories/14433" alt="denizsafak%2Fabogen | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

Abogen is a text-to-speech conversion tool, built on Kokoro-82M, that turns ePub, PDF, text, markdown, or subtitle files into high-quality audio with matching subtitles in seconds. Use it for audiobooks, voiceovers for Instagram, YouTube, or TikTok, or any project that needs natural-sounding text-to-speech.

<img title="Abogen Main" src='https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/demo/abogen.png' width="380"> <img title="Abogen Processing" src='https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/demo/abogen2.png' width="380">

Demo

https://github.com/user-attachments/assets/094ba3df-7d66-494a-bc31-0e4b41d0b865

This demo was generated in just 5 seconds, producing ∼1 minute of audio with perfectly synced subtitles. To create a similar video, see the demo guide.

How to install? <a href="https://pypi.org/project/abogen/" target="_blank"><img src="https://img.shields.io/pypi/pyversions/abogen" alt="Abogen Compatible PyPi Python Versions" align="right" style="margin-top:6px;"></a>

Windows

Go to the espeak-ng latest release page, then download and run the *.msi file.

<b>OPTION 1: Install using script</b>

  1. Download the repository
  2. Extract the ZIP file
  3. Run WINDOWS_INSTALL.bat by double-clicking it

This method handles everything automatically - installing all dependencies including CUDA in a self-contained environment without requiring a separate Python installation. (You still need to install espeak-ng.)

[!NOTE] You don't need to install Python separately. The script will install Python automatically.

<b>OPTION 2: Install using uv</b>

First, install uv if you haven't already.

# For NVIDIA GPUs (CUDA 12.8) - Recommended
uv tool install --python 3.12 abogen[cuda] --extra-index-url https://download.pytorch.org/whl/cu128 --index-strategy unsafe-best-match

# For NVIDIA GPUs (CUDA 12.6) - Older drivers
uv tool install --python 3.12 abogen[cuda126] --extra-index-url https://download.pytorch.org/whl/cu126 --index-strategy unsafe-best-match

# For NVIDIA GPUs (CUDA 13.0) - Newer drivers
uv tool install --python 3.12 abogen[cuda130] --extra-index-url https://download.pytorch.org/whl/cu130 --index-strategy unsafe-best-match

# For AMD GPUs or without GPU - If you have AMD GPU, you need to use Linux for GPU acceleration, because ROCm is not available on Windows.
uv tool install --python 3.12 abogen
<details> <summary><b>Alternative: Install using pip (click to expand)</b></summary>
# Create a virtual environment (optional)
mkdir abogen && cd abogen
python -m venv venv
venv\Scripts\activate

# For NVIDIA GPUs:
# We need to use an older version of PyTorch (2.8.0) until this issue is fixed: https://github.com/pytorch/pytorch/issues/166628
pip install torch==2.8.0+cu128 torchvision==0.23.0+cu128 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128

# For AMD GPUs:
# Not supported yet, because ROCm is not available on Windows. Use Linux if you have AMD GPU.

# Install abogen
pip install abogen
</details>

Mac

First, install uv if you haven't already.

# Install espeak-ng
brew install espeak-ng

# For Silicon Mac (M1, M2 etc.)
uv tool install --python 3.13 abogen --with "kokoro @ git+https://github.com/hexgrad/kokoro.git,numpy<2"

# For Intel Mac
uv tool install --python 3.12 abogen --with "kokoro @ git+https://github.com/hexgrad/kokoro.git,numpy<2"
<details> <summary><b>Alternative: Install using pip (click to expand)</b></summary>
# Install espeak-ng
brew install espeak-ng

# Create a virtual environment (recommended)
mkdir abogen && cd abogen
python3 -m venv venv
source venv/bin/activate

# Install abogen
pip3 install abogen

# For Silicon Mac (M1, M2 etc.)
# After installing abogen, we need to install Kokoro's development version which includes MPS support.
pip3 install git+https://github.com/hexgrad/kokoro.git
</details>

Linux

First, install uv if you haven't already.

# Install espeak-ng
sudo apt install espeak-ng # Ubuntu/Debian
sudo pacman -S espeak-ng # Arch Linux
sudo dnf install espeak-ng # Fedora

# For NVIDIA GPUs or without GPU - No need to include [cuda] in here.
uv tool install --python 3.12 abogen

# For AMD GPUs (ROCm 6.4)
uv tool install --python 3.12 abogen[rocm] --extra-index-url https://download.pytorch.org/whl/nightly/rocm6.4 --index-strategy unsafe-best-match
<details> <summary><b>Alternative: Install using pip (click to expand)</b></summary>
# Install espeak-ng
sudo apt install espeak-ng # Ubuntu/Debian
sudo pacman -S espeak-ng # Arch Linux
sudo dnf install espeak-ng # Fedora

# Create a virtual environment (recommended)
mkdir abogen && cd abogen
python3 -m venv venv
source venv/bin/activate

# Install abogen
pip3 install abogen

# For NVIDIA GPUs:
# Already supported, no need to install CUDA separately.

# For AMD GPUs:
# After installing abogen, we need to uninstall the existing torch package
pip3 uninstall torch 
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.4
</details>
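If pip installs Abogen outside a virtual environment, its scripts can land in `~/.local/bin`, which is not always on `PATH`. The following is a minimal sketch for checking this; the helper name `is_on_path` is hypothetical and not part of Abogen, and the suggested `export` line is one common fix for bash-style shells rather than the project's official recommendation.

```python
import os
from pathlib import Path
from typing import Optional


def is_on_path(directory: str, path_value: Optional[str] = None) -> bool:
    """Return True if `directory` is one of the entries in a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = Path(directory).expanduser()
    # Skip empty entries, which would otherwise be read as the current directory.
    return any(
        entry and Path(entry).expanduser() == target
        for entry in path_value.split(os.pathsep)
    )


local_bin = Path.home() / ".local" / "bin"
if is_on_path(str(local_bin)):
    print(f"{local_bin} is on PATH")
else:
    # Assumption: a bash-style shell; other shells use different syntax.
    print(f'{local_bin} is not on PATH. One common fix: export PATH="{local_bin}:$PATH"')
```

The check compares normalized paths, so a trailing slash in a `PATH` entry does not cause a false negative.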

See How to fix "CUDA GPU is not available. Using CPU" warning?

See How to fix "WARNING: The script abogen-cli is installed in '/home/username/.local/bin' which is not on PATH" error in Linux?

See How to fix "No matching distribution found" error?

See How to fix "[WinError 1114] A dynamic link library (DLL) initialization routine failed" error?
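Before working through the fixes above, it can help to confirm which accelerator PyTorch actually sees. This is a rough sketch, not part of Abogen: the function name `describe_torch_device` is hypothetical, and the script has to be run with the Python interpreter of the environment where Abogen is installed (for example, the activated virtual environment from the pip instructions).

```python
import importlib.util


def describe_torch_device() -> str:
    """Report which accelerator PyTorch can see, or why it could not be checked."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    try:
        import torch
    except (ImportError, OSError) as exc:
        # For example, a DLL initialization failure on Windows.
        return f"torch is installed but failed to import: {exc}"
    # ROCm builds of PyTorch also report through the torch.cuda API.
    if torch.cuda.is_available():
        return f"GPU available: {torch.cuda.get_device_name(0)}"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "Apple MPS available"
    return "No GPU backend available; PyTorch will run on the CPU"


print(describe_torch_device())
```

If this reports the CPU on a machine with an NVIDIA GPU, the installed PyTorch build is likely a CPU-only wheel, which is what the CUDA-specific install commands are meant to avoid.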

Special thanks to @hg000125 for his contribution in #23. AMD GPU support is possible thanks to his work.

Interfaces

Abogen offers two interfaces, but currently they have different feature sets. The Web UI contains newer features that are still being integrated into the desktop application.

| Command | Interface | Features |
|---------|-----------|----------|
| abogen | PyQt6 Desktop GUI | Stable core features |
| abogen-web | Flask Web UI | Core features + Supertonic TTS, LLM Normalization, Audiobookshelf Integration and more! |

Note: The Web UI is under active development. We are working to integrate these new features into the PyQt desktop app; until then, the Web UI provides the most feature-rich experience.
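To see which of the two entry points is available after installation, a small check like the one below may be useful. It is only a sketch: the command names come from the table above, while the helper `find_commands` is a hypothetical name introduced for illustration.

```python
import shutil

# Command names and interfaces as listed in the Interfaces table.
COMMANDS = {
    "abogen": "PyQt6 Desktop GUI",
    "abogen-web": "Flask Web UI",
}


def find_commands(commands=COMMANDS):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


for name, path in find_commands().items():
    print(f"{name} ({COMMANDS[name]}): {path or 'not found on PATH'}")
```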

Special thanks to @jeremiahsb for making this possible! I was honestly surprised by his massive contribution (>55,000 lines!) that brought the entire Web UI to life.

🖥️ Desktop Application (PyQt)

How to run?

You can simply run this command to start Abogen Desktop GUI:

abogen

[!TIP] If you installed Abogen using the Windows installer (WINDOWS_INSTALL.bat), it should have created a shortcut in the same folder or on your desktop, and you can run Abogen from there. If you lost the shortcut, run python_embedded/Scripts/abogen.exe directly.

How to use?

  1. Drag and drop any ePub, PDF, text, markdown, or subtitle file (or use the built-in text editor)
  2. Configure the settings:
    • Set speech speed
    • Select a voice (or create a custom voice using voice mixer)
    • Select subtitle generation style (by sentence, word, etc.)
    • Select output format
    • Select where to save the output
  3. Hit Start

In action

<img title="Abogen in action" src='https://raw.githubusercontent.com/denizsafak/abogen/refs/heads/main/demo/abogen.gif'>

Here’s Abogen in action: in this demo, it processes ∼3,000 characters of text in just 11 seconds and turns it into 3 minutes and 28 seconds of audio on a low-end RTX 2060 Mobile laptop GPU. Your results may vary depending on your hardware.
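The figures quoted for this demo can be turned into a real-time factor, that is, seconds of audio produced per second of processing. The sketch below just does that arithmetic; `realtime_factor` is a hypothetical helper, not an Abogen function.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are produced per second of processing."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures from the demo: 3 min 28 s of audio generated in 11 s.
audio_seconds = 3 * 60 + 28
demo_factor = realtime_factor(audio_seconds, 11)
print(f"{audio_seconds} s of audio in 11 s is about {demo_factor:.1f}x real time")
```

On the RTX 2060 Mobile used for the demo this works out to roughly 19 times faster than real time; slower or CPU-only hardware will give a lower factor.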

Configuration

| Options | Description |
|---------|-------------|
| Input Box | Drag and drop ePub, PDF, .TXT, .MD, .SRT, .ASS or .VTT files (or use built-in text editor) |
| Queue options | Add multiple files to a queue and process them in batch, with individual settings for each file. S
