MangaTranslator

Manga translation app powered by AI

Gradio-based web application that automates the translation of manga/comic page images using AI. It targets both speech bubbles and text outside speech bubbles (OSB text), and supports 59 languages and custom font packs.

<div align="left"> <table> <tr> <th style="text-align: left">Original</th> <th style="text-align: left">Translated (w/ a single click)</th> </tr> <tr> <td><img src="docs/images/example_original.jpg" width="400" /></td> <td><img src="docs/images/example_translation.jpg" width="400" /></td> </tr> </table> </div>

Features

  • Detection: Speech bubble detection & segmentation (YOLO + SAM 2.1/3)
  • Cleaning: Inpaint speech bubbles and OSB text (Flux.2 Klein, Flux.1 Kontext, or OpenCV)
  • Translation: LLM-powered OCR & translation (59 languages)
  • Rendering: Text rendering with alignment and custom font packs
  • Upscaling: 2x-AnimeSharpV4 for enhanced output quality
  • Processing: Single/batch processing with directory preservation and ZIP support
  • Interfaces: Web UI (Gradio) and CLI
  • Automation: One-click translation; no intervention required

Requirements

  • Python 3.10+
  • PyTorch (CPU, CUDA, ROCm, XPU, MPS)
  • Font pack with .ttf/.otf files; included with portable package
  • LLM for Japanese source text; VLM for other languages (API or local)
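
A quick way to check the first two requirements before installing anything else is sketched below. It is a minimal illustration, not part of the project: the helper names `python_ok` and `torch_installed` are hypothetical, and the check only confirms that a `torch` package can be found, not which backend (CUDA, ROCm, XPU, MPS) it was built for.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # "Python 3.10+" from the Requirements list


def python_ok(version=None):
    """Return True if the given (or running) interpreter is 3.10 or newer."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def torch_installed():
    """Return True if a `torch` package can be located, without importing it."""
    return importlib.util.find_spec("torch") is not None


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "OK" if python_ok() else "too old, need 3.10+"
    print(f"Python {major}.{minor}: {status}")
    print(f"PyTorch installed: {torch_installed()}")
```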

Install

Portable Package (Recommended)

Download the standalone zip from the releases page: Portable Build

Requirements:

  • Windows: Bundled Python/Git included; no additional requirements
  • Linux/macOS: Python 3.10+ and Git must be installed on your system

Setup:

  1. Extract the zip file
  2. Run the setup script for your platform:
    • Windows: Double-click setup.bat
    • Linux/macOS: Run ./setup.sh in terminal
  3. PyTorch version is automatically detected and installed based on your system
  4. Open the launcher script created in ./MangaTranslator/:
    • Windows: start-webui.bat
    • Linux/macOS: start-webui.sh

Included font packs:

  • Komika (normal text)
  • Cookies (OSB text)
  • Comicka (either)
  • Roboto (supports accents)
  • Noto Sans SC (supports Simplified Chinese)

[!TIP] If you need to move to a fresh portable package:

  • You can safely move the fonts, models, and output directories to the new portable package
  • You may also be able to move the runtime directory, provided you want the same setup configuration

Manual install

  1. Clone and enter the repo
git clone https://github.com/meangrinch/MangaTranslator.git
cd MangaTranslator
  2. Create and activate a virtual environment (recommended)
python -m venv venv
# Windows PowerShell/CMD
.\venv\Scripts\activate
# Linux/macOS
source venv/bin/activate
  3. Install PyTorch (see: PyTorch Install)
# Example (CUDA 13.0)
pip install torch==2.10.0+cu130 torchvision==0.25.0+cu130 --extra-index-url https://download.pytorch.org/whl/cu130
# Example (ROCm 7.1)
pip install torch==2.10.0+rocm7.1 torchvision==0.25.0+rocm7.1 --extra-index-url https://download.pytorch.org/whl/rocm7.1
# Example (XPU)
pip install torch==2.10.0+xpu torchvision==0.25.0+xpu --extra-index-url https://download.pytorch.org/whl/xpu
# Example (MPS/CPU)
pip install torch==2.10.0 torchvision==0.25.0
  4. Install Nunchaku (optional, for Flux.1 Kontext Nunchaku backend)
  • Nunchaku wheels are not on PyPI. Install the wheel directly from the v1.2.1 GitHub release URL, choosing the file that matches your OS and Python version. Nunchaku is CUDA-only and requires an NVIDIA 2000-series card or newer.
# Example (Windows, Python 3.13, PyTorch 2.10.0, CUDA 13.0)
pip install https://github.com/nunchaku-ai/nunchaku/releases/download/v1.2.1/nunchaku-1.2.1+cu13.0torch2.10-cp313-cp313-win_amd64.whl

[!NOTE] Nunchaku is not needed to use Flux models via the SDNQ backend.

  5. Install dependencies
pip install -r requirements.txt
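
The PyTorch examples in the manual install steps differ only in a version suffix and a wheel index, so they can be generated from a small table. The sketch below is an illustration only: `torch_install_args` and the backend keys are hypothetical names, the versions and index URLs are copied from the examples above, and it does not reproduce whatever detection logic the portable setup script uses.

```python
TORCH = "2.10.0"
TORCHVISION = "0.25.0"

# Wheel tag per backend, taken from the example commands in the install steps.
# None means the default PyPI wheels are used (the MPS/CPU example).
BACKEND_TAGS = {
    "cuda": "cu130",
    "rocm": "rocm7.1",
    "xpu": "xpu",
    "mps": None,
    "cpu": None,
}


def torch_install_args(backend):
    """Build the pip argument list for one of the backends listed above."""
    if backend not in BACKEND_TAGS:
        raise ValueError(f"unknown backend: {backend!r}")
    tag = BACKEND_TAGS[backend]
    if tag is None:
        return ["pip", "install", f"torch=={TORCH}", f"torchvision=={TORCHVISION}"]
    return [
        "pip", "install",
        f"torch=={TORCH}+{tag}",
        f"torchvision=={TORCHVISION}+{tag}",
        "--extra-index-url", f"https://download.pytorch.org/whl/{tag}",
    ]


if __name__ == "__main__":
    for name in BACKEND_TAGS:
        print(f"{name}: {' '.join(torch_install_args(name))}")
```

Returning a list rather than a string keeps the result usable with `subprocess.run` without shell quoting concerns.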

Post-Install Setup

Models

  • The application will automatically download and use all required models

Fonts

  • Put font packs as subfolders in fonts/ with .otf/.ttf files
  • Prefer filenames that include italic/bold or both so variants are detected
  • Example structure:
fonts/
├─ CC Wild Words/
│  ├─ CCWildWords-Regular.otf
│  ├─ CCWildWords-Italic.otf
│  ├─ CCWildWords-Bold.otf
│  └─ CCWildWords-BoldItalic.otf
└─ Komika/
   ├─ KOMIKA-HAND.ttf
   └─ KOMIKA-HANDBOLD.ttf
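
To check a `fonts/` folder against this layout before launching, a short script can list each pack and guess its variants from the filenames. This is a sketch under assumptions: the application's own variant detection is not documented here, so `classify_variant` uses a simple substring heuristic ("bold", "italic") that may differ from the real logic, and both function names are hypothetical.

```python
from pathlib import Path

FONT_SUFFIXES = {".ttf", ".otf"}


def classify_variant(filename):
    """Guess a variant from the filename (assumed heuristic, not the app's code)."""
    stem = Path(filename).stem.lower()
    bold = "bold" in stem
    italic = "italic" in stem
    if bold and italic:
        return "bold-italic"
    if bold:
        return "bold"
    if italic:
        return "italic"
    return "regular"


def scan_font_packs(fonts_dir):
    """Map each pack (subfolder of fonts_dir) to {variant: [font file names]}."""
    packs = {}
    for pack_dir in sorted(Path(fonts_dir).iterdir()):
        if not pack_dir.is_dir():
            continue
        variants = {}
        for font in sorted(pack_dir.iterdir()):
            if font.suffix.lower() in FONT_SUFFIXES:
                variants.setdefault(classify_variant(font.name), []).append(font.name)
        if variants:  # skip subfolders that contain no font files
            packs[pack_dir.name] = variants
    return packs


if __name__ == "__main__":
    if Path("fonts").is_dir():
        for pack, variants in scan_font_packs("fonts").items():
            print(pack, variants)
```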

LLM setup

  • Providers: Google, OpenAI, Anthropic, xAI, DeepSeek, Z.ai, Moonshot AI, OpenRouter, OpenAI-Compatible
  • Web UI: configure provider/model/key in the Config tab (stored locally)
  • CLI: pass keys/URLs as flags or via env vars
  • Env vars: GOOGLE_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY, XAI_API_KEY, DEEPSEEK_API_KEY, ZAI_API_KEY, MOONSHOT_API_KEY, OPENROUTER_API_KEY, OPENAI_COMPATIBLE_API_KEY
  • OpenAI-compatible default URL: http://localhost:1234/v1
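
When scripting around the CLI, it can help to confirm the right key is exported before starting a run. The sketch below pairs each provider with an environment variable by matching the two lists above in order; that pairing is an assumption based on the names, and `api_key_for` is a hypothetical helper, not a function from the project.

```python
import os

# Provider names and variable names come from the lists above; the pairing
# is inferred from their order and naming, not confirmed by the project.
PROVIDER_ENV_VARS = {
    "Google": "GOOGLE_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "xAI": "XAI_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Z.ai": "ZAI_API_KEY",
    "Moonshot AI": "MOONSHOT_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
    "OpenAI-Compatible": "OPENAI_COMPATIBLE_API_KEY",
}

OPENAI_COMPATIBLE_DEFAULT_URL = "http://localhost:1234/v1"


def api_key_for(provider, env=None):
    """Return the key for a provider from the environment, or None if unset."""
    env = os.environ if env is None else env
    return env.get(PROVIDER_ENV_VARS[provider]) or None


if __name__ == "__main__":
    for provider, var in PROVIDER_ENV_VARS.items():
        state = "set" if api_key_for(provider) else "not set"
        print(f"{provider:18} {var:28} {state}")
```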

[!NOTE] YanoljaNEXT-Rosetta models (e.g., yanolja/YanoljaNEXT-Rosetta-4B-2511-GGUF) are automatically detected when used via the OpenAI-Compatible provider and receive optimized prompting. These are text-only models and require two-step + local OCR model. The Special Instructions field is mapped to Rosetta's translation glossary (one entry per line, e.g., Yanolja NEXT -> 야놀자넥스트).

OSB text setup (optional)

If you want to use the OSB text pipeline, you need a Hugging Face token with access to the following repositories:

  • deepghs/AnimeText_yolo
  • black-forest-labs/FLUX.1-Kontext-dev (only required if using Flux.1 Kontext with Nunchaku backend)

Steps to create a token:

  1. Sign in or create a Hugging Face account
  2. Visit the repository pages listed above and accept their terms
  3. Create a new access token in your Hugging Face settings with read access to gated repos ("Read access to contents of public gated repos")
  4. Add the token to the app:
    • Web UI: set hf_token in Config
    • Env var (alternative): set HUGGINGFACE_TOKEN
  5. Save config to preserve the token across sessions
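
The steps above can be turned into a small checklist that prints which gated repositories need their terms accepted and whether a token is visible in the environment. This is a minimal sketch: `required_gated_repos` and `osb_checklist` are hypothetical names, the repository IDs and the `HUGGINGFACE_TOKEN` variable come from this section, and the script does not contact Hugging Face or verify that the token actually has access.

```python
import os

HF_BASE_URL = "https://huggingface.co/"


def required_gated_repos(kontext_nunchaku=False):
    """Return the gated repos the OSB text pipeline needs, per the list above."""
    repos = ["deepghs/AnimeText_yolo"]
    if kontext_nunchaku:
        # Only needed for Flux.1 Kontext with the Nunchaku backend.
        repos.append("black-forest-labs/FLUX.1-Kontext-dev")
    return repos


def osb_checklist(kontext_nunchaku=False, env=None):
    """Return human-readable checklist lines for the OSB text setup."""
    env = os.environ if env is None else env
    lines = [
        f"Accept terms: {HF_BASE_URL}{repo}"
        for repo in required_gated_repos(kontext_nunchaku)
    ]
    if env.get("HUGGINGFACE_TOKEN"):
        lines.append("HUGGINGFACE_TOKEN is set")
    else:
        lines.append("HUGGINGFACE_TOKEN is not set (or set hf_token in the Config tab)")
    return lines


if __name__ == "__main__":
    print("\n".join(osb_checklist(kontext_nunchaku=True)))
```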

Run

Web UI (Gradio)

  • Portable package:
    • Windows: Double-click start-webui.bat inside the MangaTranslator folder
    • Linux/macOS: Run ./start-webui.sh inside the MangaTranslator folder
  • Manual install:
    • Windows: Run python app.py --open-browser

Options: --models (default ./models), --fonts (default ./fonts), --port (default 7676), --cpu. First launch can take ~1–2 minutes.

Once launched, configure your LLM provider in the Config tab, then upload images and click Translate.
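
For a manual install, the launch options above can be assembled programmatically, for example from a wrapper script. The sketch below only builds the argument list. `webui_command` is a hypothetical helper, the flag names and defaults are the ones listed above, and treating `--cpu` as a value-less switch is an assumption based on how it is listed.

```python
import shlex


def webui_command(models="./models", fonts="./fonts", port=7676,
                  cpu=False, open_browser=True):
    """Build the argument list for launching the web UI from the repo root."""
    args = ["python", "app.py"]
    if open_browser:
        args.append("--open-browser")
    args += ["--models", models, "--fonts", fonts, "--port", str(port)]
    if cpu:
        args.append("--cpu")  # assumed to be a switch that takes no value
    return args


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) from the repo root to actually launch the UI.
    print(shlex.join(webui_command(port=7680, cpu=True)))
```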

CLI

Examples:

# Single image, Japanese → English, Google provider
python main.py --input <image_path> \
  --font-dir "fonts/Komika" --provider Google --google-api-key <AI...>

# Batch folder, custom source/target languages, OpenAI-Compatible provider (LM Studio)
python main.py --input <folder_path> --batch \
  --font-dir "fonts/Komika" \
  --input-language <src_lang> --output-language <tgt_lang> \
  --provider OpenAI-Compatible --openai-compatible-url http://localhost:1234/v1 \
  --output ./output

# Single image, Japanese → English (Google), OSB text pipeline, custom OSB text font
python main.py --input <image_path> \
  --font-dir "fonts/Komika" --provider Google --google-api-key <AI...> \
  --osb-enable --osb-font-dir "fonts/Clementine"

# Cleaning-only mode (no translation/text rendering)
python main.py --input <image_path> --cleaning-only

# Upscaling-only mode (no detection/translation, only upscale)
python main.py --input <image_path> --upscaling-only --image-upscale-mode final --image-upscale-factor 2.0

# Test mode (no translation; render placeholder text)
python main.py --input <image_path> --test-mode

# Full options
python main.py --help
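
To run the batch example over several folders, the command can be built in Python and handed to `subprocess`. This is a sketch under assumptions: `batch_command` is a hypothetical helper, the flags are the ones shown in the batch example above, and the language values ("Japanese", "English") and folder names are placeholders, since the accepted values are not listed here (see `python main.py --help`).

```python
import shlex


def batch_command(input_dir, font_dir, src_lang, tgt_lang,
                  url="http://localhost:1234/v1", output="./output"):
    """Build the batch-mode command for the OpenAI-Compatible provider."""
    return [
        "python", "main.py",
        "--input", input_dir, "--batch",
        "--font-dir", font_dir,
        "--input-language", src_lang,
        "--output-language", tgt_lang,
        "--provider", "OpenAI-Compatible",
        "--openai-compatible-url", url,
        "--output", output,
    ]


if __name__ == "__main__":
    # Placeholder folders and languages; replace with your own values.
    for folder in ["chapter 01", "chapter 02"]:
        cmd = batch_command(folder, "fonts/Komika", "Japanese", "English")
        # Dry run: print the quoted command. To execute it, call
        # subprocess.run(cmd, check=True) from the repo root instead.
        print(shlex.join(cmd))
```

Building the command as a list avoids quoting problems with folder names that contain spaces.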

Updating

Portable Package

  • Windows: Run update.bat from the portable package root
  • Linux/macOS: Run ./update.sh from the portable package root

Manual Install

From the repo root:

git pull
pip install -r requirements.txt  # Activate your venv first, if you use one

License & credits

<details> <summary><b>ML Models & Libraries</b></summary> </details>