Please Note

Intel has created an excellent annotation tool with the latest technologies. https://github.com/opencv/cvat

The project below is archived, and no further updates are expected.

2009 to 2020

VATIC - Video Annotation Tool from Irvine, California

VATIC is an online video annotation tool for computer vision research that crowdsources work to Amazon's Mechanical Turk. Our tool makes it easy to build massive, affordable video data sets.

INSTALLATION

Note: VATIC has only been tested on Ubuntu with the Apache 2.2 HTTP server and a MySQL server. This document describes installation on that platform; however, it should work on any operating system and with any server.

Download

You can download and extract VATIC from our website. Note: do NOT run the installer as root.

$ wget http://mit.edu/vondrick/vatic/vatic-install.sh
$ chmod +x vatic-install.sh
$ ./vatic-install.sh
$ cd vatic

HTTP Server Configuration

Open the Apache configuration file. On Ubuntu, this file is located at:

/etc/apache2/sites-enabled/000-default

If you do not use Apache on this computer for any other purpose, replace the contents of the file with:

WSGIDaemonProcess www-data
WSGIProcessGroup www-data

<VirtualHost *:80>
    ServerName vatic.domain.edu
    DocumentRoot /path/to/vatic/public

    WSGIScriptAlias /server /path/to/vatic/server.py
    CustomLog /var/log/apache2/access.log combined
</VirtualHost>

updating ServerName with your domain name, DocumentRoot with the path to the public directory in VATIC, and WSGIScriptAlias to VATIC's server.py file.

If you do use Apache for other purposes, you will have to set up a new virtual host with the correct document root and script alias, as shown above.

Make sure you have the mod_headers module enabled:

$ sudo cp /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled

After making these changes, restart Apache:

$ sudo apache2ctl graceful

SQL Server Configuration

We recommend creating a separate database specifically for VATIC:

$ mysql -u root
mysql> create database vatic;

The next section will automatically create the necessary tables.

Setup

Inside the vatic directory, copy config.py-example to config.py:

$ cp config.py-example config.py

Then open config.py and make changes to the following variables in order to configure VATIC:

signature       Amazon Mechanical Turk AWS signature (secret access key)
accesskey       Amazon Mechanical Turk AWS access key (access key ID)
sandbox         If true, put into Mturk sandbox mode. For debugging.
localhost       The local HTTP address: http://vatic.domain.edu/ so it
                matches the ServerName in Apache.
database        Database connection string: for example,
                mysql://user:pass@localhost/vatic
geolocation     API key from ipinfodb.com for geolocation services

If you do not plan on using VATIC with Mechanical Turk (offline mode only), you can leave the signature and accesskey empty.
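As a rough illustration, a filled-in config.py for offline-only use might look like the sketch below. The variable names come from the list above; every value is a placeholder, and the real config.py-example may contain settings not shown here.

```python
# Hypothetical config.py for offline-only use. All values are placeholders.
signature = ""   # AWS secret access key (left empty: Mechanical Turk not used)
accesskey = ""   # AWS access key ID (left empty: Mechanical Turk not used)
sandbox = True   # MTurk sandbox mode, for debugging

# Should match the ServerName configured in Apache.
localhost = "http://vatic.domain.edu/"

# Connection string for the database created in the previous section.
database = "mysql://user:pass@localhost/vatic"

# Placeholder for the API key from ipinfodb.com.
geolocation = "YOUR_IPINFODB_API_KEY"
```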

After saving config.py, you can initialize the database:

$ turkic setup --database

Note: if you want to reset the database, you can do this with:

$ turkic setup --database --reset

which will require confirmation to reset in order to prevent data loss.

Finally, you must also allow VATIC to access turkic, a major dependency:

$ turkic setup --public-symlink

ANNOTATION

Before you continue, you should verify that the installation was correct. You can verify this with:

$ turkic status --verify

If you receive any error messages, it means the installation was not complete and you should review the previous section. Note: If you do not plan on using Mechanical Turk, you can safely ignore any errors caused by Mechanical Turk.

Frame Extraction

Our system requires that videos are extracted into JPEG frames. Our tool can do this automatically for you:

$ mkdir /path/to/output/directory
$ turkic extract /path/to/video.mp4 /path/to/output/directory

By default, our tool will resize the frames to fit within a 720x480 rectangle. We believe this resolution is ideal for online video viewing. You can change resolution with options:

$ turkic extract /path/to/video.mp4 /path/to/output/directory \
  --width 1000 --height 1000

$ turkic extract /path/to/video.mp4 /path/to/output/directory \
  --no-resize

The tool will maintain aspect ratio in all cases.
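To make "fit within a rectangle while maintaining aspect ratio" concrete, here is a minimal sketch of the arithmetic. It is an illustration only, not VATIC's actual resizing code; the function name fit_within is hypothetical, and whether turkic upscales small frames is an assumption this sketch makes.

```python
def fit_within(width, height, max_width=720, max_height=480):
    """Scale (width, height) to fit inside max_width x max_height.

    The same scale factor is applied to both sides, so the aspect
    ratio is preserved. Assumption: frames smaller than the rectangle
    are scaled up as well as larger frames being scaled down.
    """
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 1920x1080 frame is limited by its width in the default rectangle.
print(fit_within(1920, 1080))
# The same idea with a 1000x1000 rectangle, as with --width and --height.
print(fit_within(640, 480, max_width=1000, max_height=1000))
```

The smaller of the two ratios is used so that neither side exceeds the rectangle.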

Alternatively, if you have already extracted frames, you can use the formatframes command to arrange them in a layout that VATIC understands:

$ turkic formatframes /path/to/frames/ /path/to/output/directory

The above command will read all the images in /path/to/frames and create hard links to them in /path/to/output/directory, so the image data itself is not duplicated.

Importing a Video

After extracting frames, the video can be imported into our tool for annotation. The general syntax for this operation is:

$ turkic load identifier /path/to/output/directory Label1 Label2 LabelN

where identifier is a unique string that you will use to refer to this video, /path/to/output/directory is the directory of frames, and LabelX are class labels that you want annotated (e.g., Person, Car, Bicycle). You can have as many class labels as you wish, but you must have at least one.

When a video is imported, it is broken into small segments typically of only a few seconds. When all the segments are annotated, the annotations are merged across segments because each segment overlaps another by a small margin.
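The splitting described above can be sketched as follows. This is a simplified model under stated assumptions, not VATIC's implementation: the function name is hypothetical, frame ranges are treated as inclusive, and segment length and overlap are both measured in frames.

```python
def split_into_segments(total_frames, length, overlap):
    """Return inclusive (start, stop) frame pairs covering the video.

    Each segment is `length` frames long (the last may be shorter) and
    shares `overlap` frames with the segment before it.
    """
    if length <= overlap:
        raise ValueError("length must be greater than overlap")
    segments = []
    step = length - overlap
    start = 0
    while start < total_frames:
        stop = min(start + length, total_frames) - 1
        segments.append((start, stop))
        if stop == total_frames - 1:
            break  # the video is fully covered
        start += step
    return segments


print(split_into_segments(1000, 300, 20))
```

Because neighbouring segments share frames, an object annotated near the end of one segment also appears at the start of the next, which is what makes merging across segments possible.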

The above command specifies only the required arguments, but many optional settings are available as well. We recommend using them.

MTurk Options
    --title         The title that MTurk workers see
    --description   The description that MTurk workers see
    --duration      Time in seconds that a worker has to complete the task
    --lifetime      Time in seconds that the task is online
    --keywords      Keywords that MTurk workers can search on
    --offline       Disable MTurk and use for self annotation only

Compensation Options
    --cost                  The price advertised to MTurk workers
    --per-object-bonus      A bonus in dollars paid for each object
    --completion-bonus      A bonus in dollars paid for completing the task

Qualification Options
    --min-approved-percent  Minimum percent of tasks the worker must have
                            approved before they can work for you
    --min-approved-amount   Minimum number of tasks that the worker must 
                            have completed before they can work for you

Video Options
    --length        The length of each segment for this video in frames
    --overlap       The overlap between segments in frames
    --use-frames    When splitting into segments, use only the frame intervals
                    specified in this file. Each line should contain a
                    start frame, followed by a space, then the stop frame.
                    Frames outside the intervals in this file will be
                    ignored.
    --skip          If specified, request annotations only every N frames.
    --blow-radius   When a user marks an annotation, blow away all other
                    annotations within this many frames. If you want to
                    allow the user to make fine-grained annotations, set
                    this number to a small integer, or 0 to disable. By
                    default, this is 5, which we recommend.
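As an illustration of the file format that --use-frames expects, the sketch below parses "start stop" lines into intervals. The helper names are hypothetical and this is not VATIC's parser; skipping blank lines and treating the stop frame as inclusive are assumptions.

```python
def parse_frame_intervals(text):
    """Parse lines of the form 'start stop' into (start, stop) pairs."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        start, stop = (int(part) for part in line.split())
        intervals.append((start, stop))
    return intervals


def frame_is_used(frame, intervals):
    """True if the frame lies in any interval (stop assumed inclusive)."""
    return any(start <= frame <= stop for start, stop in intervals)


example = "0 100\n250 400\n"
intervals = parse_frame_intervals(example)
print(intervals)
print(frame_is_used(150, intervals))
```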

You can also specify temporal attributes that each object label can take on. For example, you may have a person object with attributes "walking", "running", or "sitting". You specify attributes the same way as labels, except that you prepend a ~ to the text, which binds the attribute to the previous label:

$ turkic load identifier /path/to/output/directory Label1 ~Attr1A ~Attr1B \
  Label2 ~Attr2A ~Attr2B ~Attr2C Label3

In the above example, Label1 will have attributes Attr1A and Attr1B, Label2 will have attributes Attr2A, Attr2B, and Attr2C, and Label3 will have no attributes. Specifying attributes is optional.
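The binding rule (an attribute belongs to the most recent label before it) can be sketched in a few lines. This is an illustration of the rule only, not the parser turkic uses; the function name is hypothetical.

```python
def parse_labels(args):
    """Group '~Attribute' arguments under the label that precedes them."""
    labels = {}
    current = None
    for arg in args:
        if arg.startswith("~"):
            if current is None:
                raise ValueError("attribute %r has no preceding label" % arg)
            labels[current].append(arg[1:])  # drop the leading ~
        else:
            current = arg
            labels[current] = []
    return labels


args = ["Label1", "~Attr1A", "~Attr1B",
        "Label2", "~Attr2A", "~Attr2B", "~Attr2C", "Label3"]
print(parse_labels(args))
```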

Gold Standard Training

It turns out that video annotation is extremely challenging and most MTurk workers lack the necessary patience. For this reason, we recommend requiring workers to pass a "gold standard" video. When a new worker visits the task, they will be redirected to a video for which the annotations are already known. In order to move on to the true annotations, the worker must correctly annotate the gold standard video first. We have found that this approach significantly improves the quality of the annotations.

To use this feature, import a video to be used as the gold standard:

$ turkic load identifier-train /path/to/frames Label1 Label2 LabelN \
  --for-training --for-training-start 0 --for-training-stop 500 \
  --for-training-overlap 0.5 --for-training-tolerance 0.1 \
  --for-training-mistakes 1

You can also use any of the options described above. Explanations for the new options are as follows:

--for-training              Specifies that this video is gold standard
--for-training-start        Specifies the first frame to use
--for-training-stop         Specifies the last frame to use
--for-training-overlap      Percent overlap that worker's boxes must match 
--for-training-tolerance    Percent that annotations must agree temporally
--for-training-mistakes     The number of completely wrong annotations 
                            allowed. We recommend setting this to a small,
                            nonzero integer.
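To give a feel for what an overlap threshold such as 0.5 means, the sketch below scores two boxes by intersection over union. Treating "percent overlap" as intersection over union is an assumption made for illustration; the text above does not state which measure VATIC uses, and the function names and the (xmin, ymin, xmax, ymax) box layout are hypothetical.

```python
def box_overlap(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not intersect
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def boxes_match(worker_box, gold_box, threshold=0.5):
    """True if the worker's box overlaps the gold box by at least threshold."""
    return box_overlap(worker_box, gold_box) >= threshold


gold = (0, 0, 10, 10)
print(boxes_match((1, 0, 11, 10), gold))   # shifted by one pixel
print(boxes_match((5, 0, 15, 10), gold))   # shifted by half the width
```

Under this measure, a box shifted by half its width already falls below a 0.5 threshold, which suggests why a fairly lenient value is used in the example command.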

The above command will print a URL where you can enter the ground truth annotation. You must make this g
