MagiaTimeline
A general-purpose CV-based framework for extracting precise subtitle timelines from videos with embedded subtitles, from video to .ass file.
Install / Use
/learn @HurryPeng/MagiaTimelineREADME
MagiaTimeline
English | 简体中文
A general-purpose CV-based framework for extracting precise subtitle timelines from videos with embedded subtitles, from video to .ass file.
<img src="./logo/MagiaTimeline-Logo-Transparent.png" width="300">Introduction
Motivation
Many videos have embedded, unstructured subtitle text that is part of the video frames. Typical examples include RPG/GalGame recordings, Yukkuri Introduction (ゆっくり解説) videos, Voiceroid videos, and Anime Conte (アニメコント) videos. Translating these videos—essentially translating their subtitles—requires the added target-language subtitles to sync with the original subtitles. This necessitates extracting the time points when the original subtitles appear, change, and disappear, into a structured subtitle file (.ass, .srt). Traditionally, this extraction is done manually, which is a significant workload in the translation workflow. Voice recognition does not help because some videos have no voiceovers, and even when they do, the dubbing does not necessarily align with the embedded subtitles. MagiaTimeline aims to automate this process to empower individual translators and ultimately foster cultural communication.
Goals
- User-Friendliness: A general-purpose algorithm provided by MagiaTimeline supports various videos without requiring much user tuning.
- Minimum Hardware Requirement: MagiaTimeline does not require a GPU.
- Accuracy: The extracted timeline matches the original subtitles with same-frame accuracy in most cases.
- Performance: The program takes less time to run than the total duration of the video.
- Extendability: MagiaTimeline allows specialized algorithm sets (called Strategies) that speed up processing and generate more accurate results for certain video types, while reusing most of the framework’s core components.
Getting Started
Installing
Prebuilt Binary
Currently, only win_amd64 is supported. No Python installation is needed. You can find the prebuilt package on the release page.
Install from Source
- Install Python 3.12.6 (or above, but < 3.13), 64-bit version.
- Run
install.bat(for Windows) orinstall.sh(for GNU/Linux or macOS). This automatically installs dependencies listed inrequirements.txtinto a Python virtual environment. - (For Extra Job
ocronly, not required for basic timeline generation) Install Tesseract OCR v5.3.3.20231005 (or above).
Workflow
Configuration: Open config.yml to review the configurations. You can open this file with any text editor, but using an IDE that provides syntax checks and formatting is recommended. The detailed comments in config.yml will guide you through the setup.
For beginners, the following settings are recommended:
- Specify the input video file in the
sourcesection. - Set
strategytobcsandpresettodefault. This should already be the default. - Set
enginetospeculative. This should also be the default.
Debug Running: Run in debug mode to check whether the subtitle area is included in the detection window.
Temporarily make these changes to config.yml:
- Set
enginetoframewise. - Under the
framewisesection, setdebugtotrue(this is the default).
Then, run MagiaTimeline.bat/MagiaTimeline.exe (Windows) or MagiaTimeline.sh (GNU/Linux or macOS). A window will pop up showing the frames from your video with a red box at the bottom. The red box should cover the area where the subtitles appear. If it does not, adjust dialogRect under the bcs, default section and rerun. You don’t need to run the entire program in debug mode—quit by typing q in the display window whenever you’re done checking.
If the video’s resolution is too high for the display window, consider adjusting debugPyrDown under the framewise section.
Productive Running: Debug mode significantly slows down the program. After verifying alignment in debug mode, run the full video with debug off. Reset the parameters as described in the Configuration section, then run MagiaTimeline.bat/MagiaTimeline.exe (Windows) or MagiaTimeline.sh (GNU/Linux or macOS). Once the program finishes, you should see a MagiaTimelineOutput.ass file generated. As a video creator, you should already be familiar with this file format. Enjoy!
Toolbox
Strategies
A Strategy is a set of CV algorithms that tells the framework how to identify subtitles in a frame. You can choose which Strategy to use for your video.
MagiaTimeline provides two recommended general-purpose Strategies:
- Box colour stat (
bcs): A CV pipeline consisting of ML-based text detection, color statistics, and feature extraction. This strategy works for most videos without tuning and is the top recommendation. - Outline (
otl): A CV pipeline that relies on predefined parameters for text and outline color and weight to detect subtitles. It is much faster thanbcsin applicable cases, but it requires manual parameter adjustments.
Traditionally, there are also Strategies specialized for certain types of videos. These are fast and accurate but not generalizable:
- Magia Record 「マギアレコード」 《魔法纪录》, for which this project was initially created.
- Magia Exedra
- Limbus Company 「림버스컴퍼니」 《边狱公司》.
- Parako 「私立パラの丸高校」 《超能力高校》.
- BanG Dream! Girls Band Party! 「バンドリ! ガールズバンドパーティ!」 《BanG Dream! 少女乐团派对!》.
Engines
An Engine determines how frames are sampled before they are processed by Strategies. MagiaTimeline provides two Engines:
- Framewise: Processes every frame (or every n-th frame, where n is configurable) to form a linear string of features before connecting them as subtitle intervals. This engine is slow but supports all Strategies. In debug mode, it provides a visual window showing the currently processed frame, useful for debugging and tuning parameters.
- Speculative: Processes fewer frames by skipping those unlikely to contain subtitle changes. This engine is about 20× faster than the Framewise engine but only supports the
bcsandotlgeneral-purpose strategies.
Troubleshooting
Enabling Entire-Frame Detection
By default, the bcs strategy only detects subtitles in the bottom quarter of the screen (dialogRect starts at 0.75). This is because the upper part of the screen often contains text or other content that may interfere with subtitle detection.
If your video has subtitles above this region and you need full-screen detection, open config.yml. Under the bcs: → default: section, locate the dialogRect: parameter and change the 0.75 to 0.00. This allows subtitle detection across the entire frame.
Black-and-White Subtitles
If your video features black-and-white (or gray) subtitles, you may find the bcs strategy difficult to detect them. MagiaTimeline assigns a lower priority to pure grayscale colors (black/white/gray), which helps reduce false positives but can interfere with accurately picking up purely black-and-white subtitles.
To improve black-and-white subtitle detection, open config.yml and go to bcs: → default:. Find maxGreyscalePenalty: 0.70 and change it to 0.00. This removes the penalty for grayscale pixels and may significantly improve the recognition of black-and-white subtitles.
Architecture
MagiaTimeline adopts a compiler-like architecture. The source video is analyzed and transformed into intermediate representations (IR), and optimization passes are applied to these IRs to improve timeline quality.
For example, in the MagiaRecordStrategy, the process involves:
-
Frame-Wise Computer Vision (CV) Analysis
- Detect subtitles in each frame and tag them as one of the following:
- Dialog
- Blackscreen
- Whitescreen
- CG Sub
- Detect subtitles in each frame and tag them as one of the following:
-
Frame Point Intermediate Representation (FPIR)
- Each frame is represented by a Frame Point (FP) with attributes:
- Frame index
- Timestamp
- Flags
- Passes:
- Noise filtering
- Interval building
- Each frame is represented by a Frame Point (FP) with attributes:
-
Interval Intermediate Representation (IIR)
- Each subtitle is represented by an Interval with attributes:
- Time range
- Flags
- Passes:
- Gap filling
- ASS formatting
- Each subtitle is represented by an Interval with attributes:
Acknowledgements
Early-Stage Users
- 水银h2oag, who consistently worked on translating Magia Record videos into Chinese.
- 冰柠初夏_lemon, who marked timelines for Magia Record videos, tested, and promoted this tool.
- 啊哈哈QAQ, who also worked on Magia Record videos.
- 行光 (QQ: 2263221094), who led 都市零协会汉化组 in translating games by Project Moon.
- 灰色渔歌, who also works on games by Project Moon.
- 甜隐君子, who consistently translates Parako into Chinese.
Temporary Collaborators
License
This project is licensed under the GNU General Public License v3.0 (GPLv3). You are free to use, modify, and distribute this software for both commercial and non-commercial purposes, provided you comply with the GPLv3 terms and release any derivative works under the same license. When using this framework in your video production workflow, please mention “Subtitles timed by MagiaTimeline” in the credits or description. This software is provided without any warranty.
Related Skills
qqbot-channel
351.8kQQ 频道管理技能。查询频道列表、子频道、成员、发帖、公告、日程等操作。使用 qqbot_channel_api 工具代理 QQ 开放平台 HTTP 接口,自动处理 Token 鉴权。当用户需要查看频道、管理子频道、查询成员、发布帖子/公告/日程时使用。
docs-writer
100.6k`docs-writer` skill instructions As an expert technical writer and editor for the Gemini CLI project, you produce accurate, clear, and consistent documentation. When asked to write, edit, or revie
model-usage
351.8kUse CodexBar CLI local cost usage to summarize per-model usage for Codex or Claude, including the current (most recent) model or a full model breakdown. Trigger when asked for model-level usage/cost data from codexbar, or when you need a scriptable per-model summary from codexbar cost JSON.
project-overview
FlightPHP Skeleton Project Instructions This document provides guidelines and best practices for structuring and developing a project using the FlightPHP framework. Instructions for AI Coding A
