aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
- Version: 1.7.3
- Date: 2017-03-15
- Developed by: ReadBeyond
- Lead Developer: Alberto Pettarin
- License: the GNU Affero General Public License Version 3 (AGPL v3)
- Contact: aeneas@readbeyond.it
- Quick Links: Home - GitHub - PyPI - Docs - Tutorial - Benchmark - Mailing List - Web App
Goal
aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the text. In computer science this task is known as (automatically computing a) forced alignment.
For example, given this text file and this audio file, aeneas determines, for each fragment, the corresponding time interval in the audio file:
1 => [00:00:00.000, 00:00:02.640]
From fairest creatures we desire increase, => [00:00:02.640, 00:00:05.880]
That thereby beauty's rose might never die, => [00:00:05.880, 00:00:09.240]
But as the riper should by time decease, => [00:00:09.240, 00:00:11.920]
His tender heir might bear his memory: => [00:00:11.920, 00:00:15.280]
But thou contracted to thine own bright eyes, => [00:00:15.280, 00:00:18.800]
Feed'st thy light's flame with self-substantial fuel, => [00:00:18.800, 00:00:22.760]
Making a famine where abundance lies, => [00:00:22.760, 00:00:25.680]
Thy self thy foe, to thy sweet self too cruel: => [00:00:25.680, 00:00:31.240]
Thou that art now the world's fresh ornament, => [00:00:31.240, 00:00:34.400]
And only herald to the gaudy spring, => [00:00:34.400, 00:00:36.920]
Within thine own bud buriest thy content, => [00:00:36.920, 00:00:40.640]
And tender churl mak'st waste in niggarding: => [00:00:40.640, 00:00:43.640]
Pity the world, or else this glutton be, => [00:00:43.640, 00:00:48.080]
To eat the world's due, by the grave and thee. => [00:00:48.080, 00:00:53.240]

This synchronization map can be output to file in several formats, depending on its application:
- research: Audacity (AUD), ELAN (EAF), TextGrid;
- digital publishing: SMIL for EPUB 3;
- closed captioning: SubRip (SRT), SubViewer (SBV/SUB), TTML, WebVTT (VTT);
- Web: JSON;
- further processing: CSV, SSV, TSV, TXT, XML.
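As a rough illustration of what writing a synchronization map in one of these formats involves, the sketch below renders the first two fragments of the example above as SubRip (SRT) cues. It is not aeneas code: the function names and the (text, begin, end) tuple layout are hypothetical and chosen only for this example.

```python
def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3600 * 1000)
    minutes, rest = divmod(rest, 60 * 1000)
    secs, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, millis)


def to_srt(fragments):
    """Render (text, begin, end) tuples as numbered SRT cues."""
    cues = []
    for index, (text, begin, end) in enumerate(fragments, start=1):
        cues.append("%d\n%s --> %s\n%s\n" % (
            index, format_srt_time(begin), format_srt_time(end), text))
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical in-memory form of the first two fragments shown above.
fragments = [
    ("1", 0.000, 2.640),
    ("From fairest creatures we desire increase,", 2.640, 5.880),
]
print(to_srt(fragments))
```

In practice you would let aeneas write the SRT file itself by requesting that output format; the sketch only shows how little separates the time intervals from a usable subtitle file.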
System Requirements, Supported Platforms and Installation
System Requirements
- a reasonably recent machine (recommended 4 GB RAM, 2 GHz 64bit CPU)
- Python 2.7 (Linux, OS X, Windows) or 3.5 or later (Linux, OS X)
- FFmpeg
- eSpeak
- Python packages BeautifulSoup4, lxml, and numpy
- Python headers to compile the Python C/C++ extensions (optional but strongly recommended)
- A shell supporting UTF-8 (optional but strongly recommended)
Supported Platforms
aeneas has been developed and tested on Debian 64bit, with Python 2.7 and Python 3.5, which are the only supported platforms at the moment. Nevertheless, aeneas has been confirmed to work on other Linux distributions, Mac OS X, and Windows. See the PLATFORMS file for details.
If installing aeneas natively on your OS proves difficult, you are strongly encouraged to use aeneas-vagrant. It provides aeneas inside a virtualized Debian image running under VirtualBox and Vagrant, both of which can be installed on any modern OS (Linux, Mac OS X, Windows).
Installation
All-in-one installers are available for Mac OS X and Windows, and a Bash script for deb-based Linux distributions (Debian, Ubuntu) is provided in this repository. It is also possible to download a VirtualBox+Vagrant virtual machine. Please see the INSTALL file for detailed, step-by-step installation procedures for different operating systems.
The generic OS-independent procedure is simple:
- Make sure the following executables can be called from your shell: espeak, ffmpeg, ffprobe, pip, and python

- First install numpy with pip and then aeneas (this order is important):

      pip install numpy
      pip install aeneas

- To check whether you installed aeneas correctly, run:

      python -m aeneas.diagnostics
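Before installing, the first step of the procedure (checking that the required executables are reachable from your shell) can be approximated with a few lines of Python. This is a minimal sketch, not part of aeneas: the helper name missing_executables is hypothetical, and it only checks that each name resolves on your PATH, not that the tool actually works.

```python
import shutil

# Executables that must be callable from the shell, as listed above.
REQUIRED_EXECUTABLES = ["espeak", "ffmpeg", "ffprobe", "pip", "python"]


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_executables(REQUIRED_EXECUTABLES)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All required executables were found on PATH.")
```

Once aeneas is installed, the diagnostics module shown above remains the more thorough check.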
Usage
- Run without arguments to get the usage message:

      python -m aeneas.tools.execute_task
      python -m aeneas.tools.execute_job

  You can also get a list of live examples that you can immediately run on your machine thanks to the included files:

      python -m aeneas.tools.execute_task --examples
      python -m aeneas.tools.execute_task --examples-all

- To compute a synchronization map map.json for a pair (audio.mp3, text.txt in plain text format), you can run:

      python -m aeneas.tools.execute_task \
          audio.mp3 \
          text.txt \
          "task_language=eng|os_task_file_format=json|is_text_type=plain" \
          map.json

  (The command has been split into lines with \ for visual clarity; in production you can have the entire command on a single line and/or you can use shell variables.)

  To compute a synchronization map map.smil for a pair (audio.mp3, page.xhtml containing fragments marked by id attributes like f001), you can run:

      python -m aeneas.tools.execute_task \
          audio.mp3 \
          page.xhtml \
          "task_language=eng|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" \
          map.smil

  As you can see, the third argument (the configuration string) specifies the parameters controlling the I/O formats and the processing options for the task. Consult the documentation for details.

- If you have several tasks to process, you can create a job container to batch process them:

      python -m aeneas.tools.execute_job job.zip output_directory

  File job.zip should contain a config.txt or config.xml configuration file, providing aeneas with all the information needed to parse the input assets and format the output sync map files. Consult the documentation for details.
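If you call the command line tool from a script, the configuration string can be assembled from key/value pairs rather than typed by hand. The sketch below builds the argument list for the plain text example above; the two helper functions are hypothetical and not part of aeneas, and the command is only printed, not executed.

```python
def build_config_string(params):
    """Join (key, value) pairs as key=value fields separated by '|'."""
    return "|".join("%s=%s" % (key, value) for key, value in params)


def build_execute_task_command(audio_path, text_path, params, output_path,
                               python_executable="python"):
    """Return the argument list for aeneas.tools.execute_task."""
    return [
        python_executable, "-m", "aeneas.tools.execute_task",
        audio_path,
        text_path,
        build_config_string(params),
        output_path,
    ]


command = build_execute_task_command(
    "audio.mp3",
    "text.txt",
    [
        ("task_language", "eng"),
        ("os_task_file_format", "json"),
        ("is_text_type", "plain"),
    ],
    "map.json",
)
print(command)

# To actually run it (requires aeneas and the input files to be present):
# import subprocess
# subprocess.check_call(command)
```

Passing the arguments as a list avoids shell quoting problems with the | characters in the configuration string.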
The documentation contains a tutorial, highly recommended, which explains how to use the built-in command line tools.
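To picture what the is_text_unparsed_id_regex and is_text_unparsed_id_sort parameters in the SMIL command above ask for, the following sketch selects elements whose id matches a regular expression and orders them by the number embedded in the id. It is an illustration only and does not reproduce how aeneas parses files: the function name, the sample page, and the reading of "numeric" as "sort by the digits in the id" are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET


def extract_fragments(xhtml, id_regex):
    """Return (id, text) pairs for elements whose id fully matches id_regex,
    ordered by the integer formed from the digits in the id."""
    pattern = re.compile(id_regex)
    root = ET.fromstring(xhtml)
    fragments = []
    for element in root.iter():
        identifier = element.get("id")
        if identifier is not None and pattern.fullmatch(identifier):
            text = "".join(element.itertext()).strip()
            fragments.append((identifier, text))
    # Assumed meaning of a numeric sort: compare the digits, not the strings.
    fragments.sort(key=lambda item: int(re.sub(r"[^0-9]", "", item[0])))
    return fragments


# Hypothetical page; the ids are deliberately not zero-padded.
PAGE = """<html><body>
<h1 id="title">Sonnet I</h1>
<p><span id="f10">tenth</span> <span id="f2">second</span> <span id="f1">first</span></p>
</body></html>"""

print(extract_fragments(PAGE, r"f[0-9]+"))
```

With ids that are not zero-padded, a plain string sort would place f10 before f2, which is why a numeric ordering can matter; zero-padded ids like f001 sort the same way under both.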
Documentation and Support
- Documentation: http://www.readbeyond.it/aeneas/docs/
- Command line tools tutorial: http://www.readbeyond.it/aeneas/docs/clitutorial.html
- Library tutorial: http://www.readbeyond.it/aeneas/docs/libtutorial.html
- Old, verbose tutorial: A Practical Introduction To The aeneas Package
- Mailing list: https://groups.google.com/d/forum/aeneas-forced-alignment
- Changelog: http://www.readbeyond.it/aeneas/docs/changelog.html
- High level description of how aeneas works: HOWITWORKS
- Development history: HISTORY
- Testing: TESTING
- Benchmark suite: https://readbeyond.github.io/aeneas-benchmark/
Supported Features
- Input text files in parsed, plain, subtitles, or unparsed (XML) format
- Multilevel input text files in mplain and munparsed (XML) format
- Text extraction from XML (e.g., XHTML) files using id and class attributes
- Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
- Input audio file formats: all those readable by ffmpeg
- Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB, TEXTGRID, TSV, TTML, TXT, VTT, XML
- Confirmed working on 38 languages: AFR, ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG,
