# iGibson Challenge 2021 @ CVPR2021 Embodied AI Workshop
This repository contains starter code for iGibson Challenge 2021 brought to you by Stanford Vision and Learning Lab and Robotics @ Google. For an overview of the challenge, visit the challenge website. For an overview of the workshop, visit the workshop website.
## Tasks
The iGibson Challenge 2021 uses the iGibson simulator [1] and is composed of two navigation tasks that represent important skills for autonomous visual navigation:
| Interactive Navigation | Social Navigation |
|:-------------------------:|:-------------------------:|
| <img src="images/cvpr21_interactive_nav.png" height="400"> | <img src="images/cvpr21_social_nav.png" height="400"> |
- Interactive Navigation: the agent must reach a navigation goal specified by a coordinate (as in PointNav [2]) given visual information (RGB+D images). The agent is allowed (and even encouraged) to collide and interact with the environment in order to push obstacles aside and clear its path. Note that all objects in our scenes are assigned realistic physical weights and are fully interactable. However, as in the real world, some objects are light enough for the robot to move while others are not. Along with the furniture originally in the scenes, we also add additional objects (e.g. shoes and toys) from the Google Scanned Objects dataset to simulate real-world clutter. We will use Interactive Navigation Score (INS) [3] to evaluate agents' performance on this task.
- Social Navigation: the agent must navigate to a goal specified by a coordinate while moving around pedestrians in the environment. Pedestrians in the scene move toward randomly sampled locations, and their movement is simulated using the ORCA model [4] integrated in iGibson [1], similar to the simulation environments in [5]. The episode terminates if the agent collides with a pedestrian or comes closer than 0.3 meter to one. The agent should also maintain a comfortable distance (at least 0.5 meter) to pedestrians; violations are penalized in the score but do not terminate the episode. We will use the average of STL (Success weighted by Time Length) and PSC (Personal Space Compliance) to evaluate the agents' performance. More details can be found in the "Evaluation Metrics" section below.
## Evaluation Metrics
- Interactive Navigation: We will use Interactive Navigation Score (INS) as our evaluation metric. INS is the average of Path Efficiency and Effort Efficiency. Path Efficiency is equivalent to SPL (Success weighted by Path Length). Effort Efficiency captures both the excess displaced mass (kinematic effort) and the applied force (dynamic effort) from interaction with objects. We argue that the agent needs to strike a healthy balance between taking a shorter path to the goal and causing less disturbance to the environment. More details can be found in our paper.
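As a rough sketch of how INS combines the two terms (the effort-efficiency computation itself is defined in the paper and is taken here as a given input in [0, 1]; function names are illustrative, not the official evaluation code):

```python
def path_efficiency(success, shortest_path, actual_path):
    # SPL-style term: 0 on failure, else ratio of shortest to actual path length
    if not success:
        return 0.0
    return shortest_path / max(shortest_path, actual_path)

def interactive_nav_score(success, shortest_path, actual_path, effort_efficiency):
    # INS averages path efficiency with effort efficiency; the effort term
    # (combining displaced mass and applied force) is computed by the
    # benchmark and is just an input in [0, 1] here.
    return 0.5 * (path_efficiency(success, shortest_path, actual_path)
                  + effort_efficiency)
```

For example, a successful episode that takes twice the shortest path with perfect effort efficiency scores 0.5 * (0.5 + 1.0) = 0.75.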
- Social Navigation: We will use the average of STL (Success weighted by Time Length) and PSC (Personal Space Compliance) as our evaluation metric. STL is computed as success * (time_spent_by_ORCA_agent / time_spent_by_robot_agent), where the second term is the ratio between the number of timesteps an oracle ORCA agent takes to reach the same goal and the number the robot agent takes; this ratio is clipped at 1. In the context of Social Navigation, we argue STL is more applicable than SPL because a robot agent could achieve a perfect SPL by "waiting out" all pedestrians before making a move, which defeats the purpose of the task. PSC (Personal Space Compliance) is computed as the percentage of timesteps in which the robot agent complies with the pedestrians' personal space (distance >= 0.5 meter). We argue that the agent needs to strike a healthy balance between reaching the goal quickly and incurring fewer personal-space violations.
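Under the definitions above, the Social Navigation score can be sketched as follows (function names and the per-step distance representation are illustrative assumptions; the official evaluation code may differ):

```python
def stl(success, t_orca, t_robot):
    # Success weighted by Time Length: oracle-ORCA time over robot time, clipped at 1
    return float(success) * min(1.0, t_orca / t_robot)

def psc(distances_per_step, threshold=0.5):
    # Personal Space Compliance: fraction of timesteps where the robot keeps
    # at least `threshold` meters from every pedestrian; each entry is the
    # list of robot-to-pedestrian distances at that timestep
    compliant = sum(1 for dists in distances_per_step if min(dists) >= threshold)
    return compliant / len(distances_per_step)

def social_nav_score(success, t_orca, t_robot, distances_per_step):
    # Final score: average of STL and PSC
    return 0.5 * (stl(success, t_orca, t_robot) + psc(distances_per_step))
```

For example, a successful run that takes twice as long as the ORCA oracle while respecting personal space at every step scores 0.5 * (0.5 + 1.0) = 0.75.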
## Dataset
We provide 8 scenes in total, reconstructed from real-world apartments, for training in iGibson. All objects in the scenes are assigned realistic weights and are fully interactable. For Interactive Navigation, we also provide 20 additional small objects (e.g. shoes and toys) from the Google Scanned Objects dataset. For fairness, please use only these scenes and objects for training.
For evaluation, we have 2 unseen scenes in our dev split and 5 unseen scenes in our test split. We also use 10 unseen small objects (they share the same object categories as the 20 training small objects, but are different object instances).
Visualizations of the 8 training scenes:

## Setup
We adopt the following task setup:
- Observation: (1) Goal position relative to the robot in polar coordinates, (2) current linear and angular velocities, (3) RGB+D images.
- Action: Desired normalized linear and angular velocity.
- Reward: We provide some basic reward functions for reaching goal and making progress. Feel free to create your own.
- Termination conditions: The episode terminates after 500 timesteps, or (in the Social Nav task) when the robot collides with any pedestrian.
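The normalized action above has to be mapped to physical velocity commands at some point. A minimal sketch, assuming placeholder maximum speeds (the real limits are given in the robot tech spec linked below):

```python
# Hypothetical limits -- the actual values come from the robot tech spec
MAX_LINEAR_VEL = 0.5   # m/s
MAX_ANGULAR_VEL = 1.5  # rad/s

def denormalize_action(action):
    # Map a normalized [-1, 1] action to physical linear/angular velocities,
    # clamping out-of-range policy outputs first
    lin = max(-1.0, min(1.0, action[0])) * MAX_LINEAR_VEL
    ang = max(-1.0, min(1.0, action[1])) * MAX_ANGULAR_VEL
    return lin, ang
```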
The tech spec for the robot and the camera sensor can be found here.
For Interactive Navigation, we place N additional small objects (e.g. toys, shoes) near the robot's shortest path to the goal (N is proportional to the path length). These objects are generally physically lighter than the objects originally in the scenes (e.g. tables, chairs).
For Social Navigation, we place M pedestrians randomly in the scenes that pursue their own random goals during the episode while respecting each other's personal space (M is proportional to the physical size of the scene). The pedestrians have the same maximum speed as the robot. They are aware of the robot so they won't walk straight into the robot. However, they also won't yield to the robot: if the robot moves straight towards the pedestrians, it will hit them and the episode will fail.
## Participation Guidelines
Participate in the contest by registering on the EvalAI challenge page and creating a team. Participants will upload docker containers with their agents, which will be evaluated on an AWS GPU-enabled instance. Before pushing a submission for remote evaluation, participants should test the submission docker locally to make sure it works. Instructions for training, local evaluation, and online submission are provided below.
### Local Evaluation
- Step 1: Clone the challenge repository:

  ```bash
  git clone https://github.com/StanfordVL/iGibsonChallenge2021.git
  cd iGibsonChallenge2021
  ```

  Three example agents are provided in `simple_agent.py` and `rl_agent.py`: `RandomAgent`, `ForwardOnlyAgent`, and `SACAgent`. Here is the `RandomAgent` defined in `simple_agent.py`:

  ```python
  import numpy as np

  ACTION_DIM = 2
  LINEAR_VEL_DIM = 0
  ANGULAR_VEL_DIM = 1

  class RandomAgent:
      def __init__(self):
          pass

      def reset(self):
          pass

      def act(self, observations):
          action = np.random.uniform(low=-1, high=1, size=(ACTION_DIM,))
          return action
  ```

  Please implement your own agent and instantiate it from `agent.py`.
- Step 2: Install nvidia-docker2, following the guide: https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0).
- Step 3: Modify the provided Dockerfile to accommodate any dependencies. A minimal Dockerfile is shown below:

  ```dockerfile
  FROM gibsonchallenge/gibson_challenge_2021:latest
  ENV PATH /miniconda/envs/gibson/bin:$PATH

  ADD agent.py /agent.py
  ADD simple_agent.py /simple_agent.py
  ADD rl_agent.py /rl_agent.py
  ADD submission.sh /submission.sh

  WORKDIR /
  ```

  Then build your docker container with `docker build . -t my_submission`, where `my_submission` is the docker image name you want to use.
- Step 4: Download the challenge data by running `./download.sh`; the data will be decompressed into `gibson_challenge_data_2021`.
- Step 5: Evaluate locally by running:

  ```bash
  ./test_minival_locally.sh --docker-name my_submission
  ```

  If things work properly, you should see terminal output like the following at the end:

  ```
  ...
  Episode: 1/3
  Episode: 2/3
  Episode: 3/3
  Avg success: 0.0
  Avg stl: 0.0
  Avg psc: 1.0
  Avg episode_return: -0.6209138999323173
  ...
  ```

  The script evaluates Social Navigation by default. If you want to evaluate Interactive Navigation, change `CONFIG_FILE`, `TASK` and `EPISODE_DIR` in the script and make them consistent. It is recommended to use the `TASK` environment variable to switch agents in `agent.py` if you intend to use different policies for the two tasks.
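One way to switch policies per task in `agent.py` is to dispatch on the `TASK` environment variable. A minimal sketch (the class names and the exact `TASK` values are placeholders; check the evaluation script for the real values):

```python
import os

class SocialNavAgent:
    """Placeholder policy for Social Navigation."""
    def reset(self):
        pass
    def act(self, observations):
        return [0.0, 0.0]

class InteractiveNavAgent:
    """Placeholder policy for Interactive Navigation."""
    def reset(self):
        pass
    def act(self, observations):
        return [0.0, 0.0]

def get_agent():
    # Dispatch on the TASK environment variable set by the evaluation script
    task = os.environ.get("TASK", "social")
    if "interactive" in task:
        return InteractiveNavAgent()
    return SocialNavAgent()
```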
### Online submission
Follow instructions in the submit tab of the EvalAI challenge page to submit your docker image. Note that you will need a version of EvalAI >= 1.2.3. Here we reproduce part of those instructions for convenience:
```bash
# Installing EvalAI Command Line Interface
pip install "evalai>=1.2.3"

# Set EvalAI account token
evalai set_token <your EvalAI participant token>

# Push docker image to EvalAI docker registry
evalai push my_submission:latest --phase <phase-name>
```
The valid challenge phases are: `igibson-minival-social-808`, `igibson-minival-interactive-808`, `igibson-dev-social-808`, `igibson-dev-interactive-808`, `igibson-test-social-808`, `igibson-test-interactive-808`.
Our iGibson Challenge 2021 consists of four phases:
- Minival Phase (`igibson-minival-social-808`, `igibson-minival-interactive-808`): The purpose of this phase is to make sure your policy can be successfully evaluated.
