AutoVOT
Trainable algorithm for automatic measurement of voice onset time
Joseph Keshet (joseph.keshet@biu.ac.il)
Morgan Sonderegger (morgan.sonderegger@mcgill.ca)
Thea Knowles (thea.knowles@gmail.com)
AutoVOT is a software package for automatic measurement of positive voice onset time (VOT), using an algorithm which is trained to mimic VOT measurement by human annotators. It works as follows:
- The user provides wav files containing a number of stop consonants, and corresponding Praat TextGrids containing some information about roughly where each stop consonant is located.
- A classifier is used to find the VOT for each stop consonant, and add a new tier to each TextGrid containing these measurements.
- The user can either use a pre-existing classifier, or (recommended) train a new one using a small number (~100) of manually-labeled VOTs from their own data.
This is a beta version of AutoVOT. Any reports of bugs, comments on how to improve the software or documentation, or questions are greatly appreciated, and should be sent to the authors at the addresses given above.
Please note that at this time AutoVOT does not support predictions of negative VOT. Please see the Dr.VOT system if this is of interest to you.
For a quick-start, first download and compile the code then go to the tutorial section to begin.
<a name="toc"/>Table of Contents
1. Setting up
2. Usage
3. Tutorial
<a name="settingup"/>Setting Up
Dependencies
In order to use AutoVOT you'll need the following installed in addition to the source code provided here:
- To install the Python dependencies, run `pip install -r requirements.txt` from the main directory of the repository. You may also install each dependency separately using `pip install [package name]`.
- If you're using Mac OS X you'll need to download GCC, as it isn't installed by default. You can either:
- Install Xcode, then install Command Line Tools using the Components tab of the Downloads preferences panel.
- Download the Command Line Tools for Xcode as a stand-alone package.
You will need a registered Apple ID to download either package.
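On recent versions of OS X, the Command Line Tools (which include the compilers) can also be triggered directly from the terminal. This command is not part of the original instructions, but is a standard Apple-provided shortcut:

```shell
# Prompts installation of the Command Line Tools (compilers, make, etc.)
xcode-select --install
```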
Downloading and Installing
What is included in the download?
<a name="praatsetup"/>Praat plugin installation
Download the latest Praat plugin installer from the releases page
Double click on the installer icon, then:
- In Finder, press `Cmd + Shift + G` and enter `~/Library/Preferences/Praat Prefs`. This will open your Praat preferences folder, where plugins live.
- Drag the `autovot_plugin` folder into your Praat Prefs folder.
- Note that test data for the tutorial and log files will live in this folder whenever you run the Praat plugin.
Quick-start: Bring me to the tutorial
Back to top
<a name="commandlinesetup"/>Command line installation
Please note:
- For a quick-start, skip to the tutorial section below after compiling.
- All commands in this readme should be executed from the command line on a Unix-style system (OS X or Linux).
- All commands for AutoVOT Version 0.91 have been tested on OS X Mavericks only.
- Any feedback is greatly appreciated!
AutoVOT can be cloned from GitHub, which gives you easy access to future updates. To clone AutoVOT, run:
$ git clone https://github.com/mlml/autovot.git
When updates become available, you may navigate to the directory and run:
$ git pull origin master
If you are new to Github, check out the following site for helpful tutorials and tips for getting set up:
https://help.github.com/articles/set-up-git
Alternatively, you can download the current version of AutoVOT as a zip file, in which case you will not have access to future updates without re-downloading the updated version.
Compiling
Note: While you only have to clean and compile once, you will have to add the path to the compiled binaries to your `PATH` every time you open a new terminal window.
Clean and compile from the code directory:
$ cd autovot/autovot/code
$ make clean
If successful, the final line of the output should be:
[make] Cleaning completed
Then, run:
$ make
Final line of the output should be:
[make] Compiling completed
Finally, add the path to the compiled binaries to your `PATH`. If you are not working out of the provided experiments directory, substitute the path to your intended working directory.
IMPORTANT: YOU MUST ADD THE PATH EVERY TIME YOU OPEN A NEW TERMINAL WINDOW
$ cd ../../experiments
$ export PATH=$PATH:/[YOUR PATH HERE]/autovot/autovot/bin
For example:
$ export PATH=$PATH:/Users/mcgillLing/3_MLML/autovot/autovot/bin
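To avoid re-typing the export in every new terminal window, you can append it to your shell startup file. This is a convenience sketch, not part of the original instructions; the path shown is the README's example and should be replaced with your own clone location (bash assumed):

```shell
# Persist the PATH addition across terminal sessions
echo 'export PATH="$PATH:/Users/mcgillLing/3_MLML/autovot/autovot/bin"' >> ~/.bash_profile
```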
Quick-start: Bring me to the tutorial
Back to top
<a name="outofthebox"/>Out of the box:
Files included in this version:
- AutoVOT scripts: `autovot/` contains all scripts necessary for the user to extract features, train, and decode VOT measurements.
- Tutorial example data: `experiments/data/tutorialExample/` contains the .wav and .TextGrid files used for training and testing, as well as `makeConfigFiles.sh`, a helper script used to generate file lists.
  - Note: This data contains short utterances with one VOT window per file. Future versions will contain examples with longer files and more instances of VOT per file.
  - The TextGrids contain 3 tiers, one of which will be used by AutoVOT. The tiers are `phones`, `words`, and `vot`. The `vot` tier contains manually aligned VOT intervals that are labeled "vot".
- Example classifiers: `experiments/models/` contains three pre-trained classifiers that the user may use if they do not wish to provide their own training data. All example classifiers were used in Sonderegger & Keshet (2012) and correspond to the Big Brother and PGWords datasets in that paper:
  - Big Brother: the `bb_jasa.classifier` files are trained on conversational British speech. Word-initial voiceless stops were included in training. This classifier is best to use if working with conversational speech.
  - PGWords: `nattalia_jasa.classifier` is trained on single-word productions from lab speech: L1 American English and L2 English/L1 Portuguese bilinguals. Word-initial voiceless stops were included in training. This classifier is best to use if working with lab speech.
  - Note: For best performance the authors recommend hand-labeling a small subset of VOTs (~100 tokens) from your own data and training new classifiers (see information on training below). Experiments suggesting this works better than using a classifier pre-trained on another dataset are given in Sonderegger & Keshet (2012).
User provided files and directories
Important: Input TextGrids will be overwritten. If you wish to access your original files, be sure to back them up elsewhere.
Sound file format
- Wav files must be mono, sampled at 16 kHz.
- You can convert wav files using a utility such as SoX, as follows:

$ sox input.wav -c 1 -r 16000 output.wav
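If you have many recordings, the SoX command above can be wrapped in a loop. A minimal sketch, assuming SoX is installed and your originals sit in the current directory; the `converted/` output folder is an illustrative name:

```shell
# Batch-convert every wav file in the current directory to 16 kHz mono
mkdir -p converted
for f in *.wav; do
  [ -e "$f" ] || continue   # skip cleanly if no wav files are present
  sox "$f" -c 1 -r 16000 "converted/$f"
done
```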
TextGrid file format
- Saved as text files with .TextGrid extension
- TextGrids for training must contain a tier with hand measured vot intervals. These intervals must have a common text label, such as "vot".
- TextGrids for testing must contain a tier with window intervals indicating the range of times where the algorithm should look for the VOT onset. These intervals must also have a common label, such as "window". For best performance the window intervals should:
- contain no more than one stop consonant
- contain about 50 msec before the beginning of the burst, or
- if only force-aligned segments are available (each corresponding to an entire stop), contain about 30 msec before the beginning of the segment.
Directory format
The experiments folder contains subdirectories that will be used to store files generated by the scripts, in addition to data to be used during the working tutorial.
(See example data & experiment folders.)
- `experiments/config/`: Currently empty. This is where lists of file names will be stored.
- `experiments/models/`: Currently contains example classifiers. This is also where your own classifiers will eventually be stored.
- `experiments/tmp_dir/`: Currently empty. This is where extracted features will be stored in Mode 2.
- `experiments/data/tutorialExample/`: Contains the TextGrids and wav files used for training and testing during the tutorial.
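The file lists stored in `experiments/config/` are plain text files with one path per line. The tutorial's `makeConfigFiles.sh` generates them for you; the following is a hypothetical sketch of doing the same by hand from the experiments directory (the list file names here are illustrative, not the script's actual output names):

```shell
# Ensure the directories exist (they already do in the real repository)
mkdir -p config data/tutorialExample
# Write one path per line into the config lists
find data/tutorialExample -name '*.wav'      > config/wavList.txt
find data/tutorialExample -name '*.TextGrid' > config/tgList.txt
```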
Back to top
<a name="usage"/>Usage
Back to top
Tutorial to follow
AutoVOT allows for two modes of feature extraction:
- Mode 1 - Covert feature extraction: The handling of feature extraction is hidden. When training a classifier using these features, a cross-validation set can be specified; otherwise a random 20% of the training data will be used. The output consists of modified TextGrids with a tier containing VOT prediction intervals.
- Mode 2 - Features extracted to a known directory: Training and decoding are done after feature extraction. Features are extracted to a known directory once, after which training and decoding can be run without re-extracting the features.
