<img src="img/deepracer.png?raw=true" height="70">

About

This README provides an overview of how our team approached the University of Sydney's 2020 AWS DeepRacer competition. The competition was run by the School of Computer Science, which provided teams with AWS credits to develop and train a DeepRacer model. Developing the model required defining an action space, writing a reward function for reinforcement learning, and experimenting with the hyperparameters controlling the underlying 3-layer neural network.

Team

About
Results
Development
- Qualifier Model
- Finals Model
Conclusion
Acknowledgments

Results

USYD 2020 Finals (1st Place)

Track - Circuit de Barcelona-Catalunya

USYD 2020 Qualifier (1st Place)

Track - 2019 DeepRacer Championship Cup

Development

Qualifier Model

Defining the action space

The qualifier track was the 2019 DeepRacer Championship Cup track, a relatively straightforward loop with minor turns. We chose an action space with as few actions as possible (to reduce training time) while keeping the actions we believed were necessary to complete the track at speed. We settled on a maximum speed of 3 m/s after trial-and-error racing of similar models with 2 and 4 m/s maximum speeds. A slower speed of 1.5 m/s was also included, allowing the vehicle to achieve intermediate speeds by switching between the two. As the turns on this track are relatively smooth, we limited the steering to 20 degrees, but still found it useful to include an intermediate steering angle for smaller corrections.
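As a rough illustration of the action space described above, the following sketch enumerates every combination of the two speeds and a symmetric set of steering angles. The 1.5 and 3 m/s speeds and the 20 degree limit come from the text; the 10 degree intermediate angle and the dictionary key names are assumptions made for this example, not values taken from our model.

```python
from itertools import product

# Speeds and the 20-degree steering limit come from the text above.
# The 10-degree intermediate angle is an ASSUMED value for illustration.
SPEEDS = [1.5, 3.0]                      # m/s
STEERING_ANGLES = [-20, -10, 0, 10, 20]  # degrees


def build_action_space(speeds, steering_angles):
    """Return every (steering angle, speed) pair as a list of dicts.

    The key names are illustrative, not a confirmed file format.
    """
    return [
        {"steering_angle": angle, "speed": speed}
        for angle, speed in product(steering_angles, speeds)
    ]


action_space = build_action_space(SPEEDS, STEERING_ANGLES)
for index, action in enumerate(action_space):
    print(index, action)
```

Under these assumptions the space contains 10 discrete actions; dropping the intermediate angle would reduce it to 6 at the cost of coarser steering corrections.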

Developing the reward function

Initially, we trained the model on the much simpler Oval and Bowtie tracks using a centreline-following reward function with an incentive for faster speeds while travelling straight.

The sub-rewards can be seen in this code snippet from reward_simple.py:

  # Strongly discourage going off track
  if not all_wheels_on_track or is_offtrack:
      reward = 1e-3
      return float(reward)

  # Give higher reward if the car is closer to centre line and vice versa
  # 0 if you're on edge of track, 1 if you're centre of track
  reward = 1 - distance_from_center/(track_width/2)

  # Reward going faster when the car isn't turning
  if abs(steering_angle) < STEERING_THRESHOLD and speed > SPEED_THRESHOLD:
      reward += speed/SPEED_MAX

We chose to add sub-rewards rather than multiply them, based on the experience of Daniel Gonzalez shared in "An Advanced Guide to AWS DeepRacer".

We realised that a linear incentive for staying near the centre of the track would be limiting for the vehicle when it would be faster to "cut" the curvature of a turn. So the linear centreline sub-reward was replaced by a quadratic one, which meant the reward was less sensitive to small movements away from the centreline:

# Give higher reward if the car is closer to centre line and vice versa
# 0 if you're on edge of track, 1 if you're centre of track
reward = 1 - (distance_from_center/(track_width/2))**2
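To make the difference between the two centreline sub-rewards concrete, the sketch below evaluates both forms at several distances from the centreline. The function names and the 1 m track width are assumptions chosen for the example.

```python
def centre_reward_linear(distance_from_center, track_width):
    """Linear form: 1 at the centreline, 0 at the track edge."""
    return 1 - distance_from_center / (track_width / 2)


def centre_reward_quadratic(distance_from_center, track_width):
    """Quadratic form: same endpoints, but flatter near the centreline."""
    return 1 - (distance_from_center / (track_width / 2)) ** 2


TRACK_WIDTH = 1.0  # metres; ASSUMED value for illustration

for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    # Distance expressed as a fraction of the half-width of the track
    distance = fraction * TRACK_WIDTH / 2
    linear = centre_reward_linear(distance, TRACK_WIDTH)
    quadratic = centre_reward_quadratic(distance, TRACK_WIDTH)
    print(f"{fraction:.2f} of half-width: linear={linear:.4f}, quadratic={quadratic:.4f}")
```

A car a quarter of the way from the centreline to the edge keeps about 94% of this sub-reward under the quadratic form, compared with 75% under the linear form, so cutting a corner slightly costs much less.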

An additional sub-reward was also included to encourage the vehicle to progress through the track faster relative to the number of steps taken (note the step-rate is constant at 15 Hz).

# Reward progress
reward += progress/steps
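The effect of this sub-reward can be checked with a small worked example. The sketch assumes that progress is expressed as a percentage of the track completed (0 to 100) and uses the 15 Hz step rate noted above; the helper names and the guard against zero steps are additions for this example.

```python
STEP_RATE_HZ = 15  # constant step rate noted in the text


def progress_reward(progress, steps):
    """Progress sub-reward: percent of track completed per step taken.

    The guard against steps == 0 is an addition for this sketch.
    """
    return progress / steps if steps > 0 else 0.0


def final_step_reward(lap_time_s):
    """Sub-reward on the last step of a full lap completed in lap_time_s seconds."""
    steps = round(lap_time_s * STEP_RATE_HZ)
    return progress_reward(100.0, steps)


for lap_time in (10, 15, 20):
    print(f"{lap_time} s lap: {final_step_reward(lap_time):.3f}")
```

Because the step rate is fixed, halving the lap time doubles this term, which is what pushes the model towards completing the track in fewer steps.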

Once the model demonstrated a basic ability to follow the simple tracks, we moved on to the 2019 DeepRacer Championship Cup track.

A noticeable sticking point was the model's inability to take the north-west corner at high speed (note this track is traversed anti-clockwise). It would often approach the turn too quickly to position itself in time to take the turn successfully, an issue we occasionally observed on other turns as well. To address this, we implemented a method of detecting corners ahead of the vehicle using waypoint information, and incentivised slowing down ahead of upcoming corners.

import math

import numpy as np


def identify_corner(waypoints, closest_waypoints, future_step):
    """Return the heading change (degrees) towards a waypoint future_step
    beyond the next waypoint, and the distance to that waypoint."""

    # Identify the previous, next and a further waypoint
    # (the further index is clamped to the last waypoint, not wrapped)
    point_prev = waypoints[closest_waypoints[0]]
    point_next = waypoints[closest_waypoints[1]]
    point_future = waypoints[min(len(waypoints) - 1,
                                 closest_waypoints[1] + future_step)]

    # Calculate headings from the previous waypoint to the next and further waypoints
    heading_current = math.degrees(math.atan2(point_next[1] - point_prev[1],
                                              point_next[0] - point_prev[0]))
    heading_future = math.degrees(math.atan2(point_future[1] - point_prev[1],
                                             point_future[0] - point_prev[0]))

    # Calculate the difference between the headings
    diff_heading = abs(heading_current - heading_future)

    # Check we didn't choose the reflex angle
    if diff_heading > 180:
        diff_heading = 360 - diff_heading

    # Calculate distance from the next waypoint to the further waypoint
    dist_future = np.linalg.norm([point_future[0] - point_next[0],
                                 point_future[1] - point_next[1]])

    return diff_heading, dist_future
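To show what the function returns, the following sketch restates it compactly (using math.hypot for the distance so the example needs only the standard library) and calls it on a synthetic L-shaped set of waypoints. The waypoints and the future_step values are assumptions chosen for the example, not data from the competition track.

```python
import math


def identify_corner(waypoints, closest_waypoints, future_step):
    """Compact restatement of the function above, standard library only."""
    prev = waypoints[closest_waypoints[0]]
    nxt = waypoints[closest_waypoints[1]]
    future = waypoints[min(len(waypoints) - 1, closest_waypoints[1] + future_step)]

    heading_current = math.degrees(math.atan2(nxt[1] - prev[1], nxt[0] - prev[0]))
    heading_future = math.degrees(math.atan2(future[1] - prev[1], future[0] - prev[0]))

    diff_heading = abs(heading_current - heading_future)
    if diff_heading > 180:
        diff_heading = 360 - diff_heading

    dist_future = math.hypot(future[0] - nxt[0], future[1] - nxt[1])
    return diff_heading, dist_future


# Synthetic track (ASSUMED): 5 m straight along x, then a 90-degree left turn.
waypoints = [(float(x), 0.0) for x in range(6)] + [(5.0, float(y)) for y in range(1, 6)]

# Looking 3 waypoints ahead from the start of the straight
diff_straight, dist_straight = identify_corner(waypoints, [0, 1], 3)

# Looking 4 waypoints ahead from near the end of the straight
diff_corner, dist_corner = identify_corner(waypoints, [3, 4], 4)

print(f"straight: diff={diff_straight:.1f} deg, dist={dist_straight:.2f} m")
print(f"corner:   diff={diff_corner:.1f} deg, dist={dist_corner:.2f} m")
```

The first call looks along the straight and reports no heading change, while the second looks past the bend and reports a heading change of roughly 56 degrees.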

The identify_corner() function was used to identify whether a corner existed between the car and a specified waypoint ahead. However, waypoint spacing is not consistent, so searching a constant number of waypoints ahead risked slowing the car unnecessarily when the corner was still far away. To mitigate this, once a corner was identified we checked whether it was within a minimum distance of the car; if not, the function was called again for a closer waypoint. This additional check ran only when a corner had been identified, to determine whether it was far enough away that a straight portion of track still lay between the car and the corner. We did not apply the reverse check: with our choice of parameters and the layout of this track, whenever identify_corner() indicated that the track ahead was straight, the track between the car and the evaluated waypoint was generally also straight, even where the waypoints were spaced far apart.

def select_speed(waypoints, closest_waypoints, future_step, mid_step):

    # Identify if a corner is in the future
    diff_heading, dist_future = identify_corner(waypoints, closest_waypoints, future_step)

    if diff_heading < TURN_THRESHOLD:
        # If there's no corner encourage going faster
        go_fast = True
    else:
        if dist_future < DIST_THRESHOLD:
            # If there is a corner and it's close encourage going slower
            go_fast = False
        else:
            # If the corner is far away, re-assess closer points
            diff_heading_mid, dist_mid = identify_corner(waypoints, closest_waypoints, mid_step)

            if diff_heading_mid < TURN_THRESHOLD:
                # If there's no corner encourage going faster
                go_fast = True
            else:
                # If there is a corner and it's close encourage going slower
                go_fast = False

    return go_fast

# Implement speed incentive
go_fast = select_speed(waypoints, closest_waypoints, FUTURE_STEP, MID_STEP)

if go_fast and speed > SPEED_THRESHOLD:
    reward += 0.5

elif not go_fast and speed < SPEED_THRESHOLD:
    reward += 0.5

These functions refer to several parameters (TURN_THRESHOLD, DIST_THRESHOLD, FUTURE_STEP, MID_STEP and SPEED_THRESHOLD) which affect when the car is incentivised to go faster or slower. To determine the best values for these, it was useful to visualise their effect. Using the track data provided by the Autonomous Race Car Community's waypoint-visualization git repository, and again taking inspiration from the Advanced Guide to AWS DeepRacer article, we developed our own visualisation tool (qualifier_planner.py), which identifies the regions of the track where our reward_qualifier.py function would reward the car for going faster or slower.

The points labelled "Bonus Fast" show the effect of the additional distance check implemented in the select_speed() function discussed earlier (i.e. points which would have been marked "Slow" had the distance check not been incorporated). The actual reward function does not differentiate between "Fast" and "Bonus Fast" regions.
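The labelling itself can be sketched as follows. This is not the code from qualifier_planner.py; it is a minimal standalone illustration that applies the same decision logic as select_speed() to each waypoint of a synthetic track and returns one of the three labels. All four parameter values and the track are assumptions, and the plotting step is omitted.

```python
import math

# All four values are ASSUMED for illustration; they are not our tuned parameters.
FUTURE_STEP = 6
MID_STEP = 3
TURN_THRESHOLD = 10   # degrees
DIST_THRESHOLD = 4.0  # metres


def identify_corner(waypoints, closest_waypoints, future_step):
    """Heading change (degrees) and distance to a waypoint further ahead."""
    prev = waypoints[closest_waypoints[0]]
    nxt = waypoints[closest_waypoints[1]]
    future = waypoints[min(len(waypoints) - 1, closest_waypoints[1] + future_step)]
    heading_current = math.degrees(math.atan2(nxt[1] - prev[1], nxt[0] - prev[0]))
    heading_future = math.degrees(math.atan2(future[1] - prev[1], future[0] - prev[0]))
    diff = abs(heading_current - heading_future)
    if diff > 180:
        diff = 360 - diff
    return diff, math.hypot(future[0] - nxt[0], future[1] - nxt[1])


def classify(waypoints, i):
    """Label waypoint i as 'Fast', 'Slow' or 'Bonus Fast'."""
    closest = [i, i + 1]
    diff, dist = identify_corner(waypoints, closest, FUTURE_STEP)
    if diff < TURN_THRESHOLD:
        return "Fast"            # no corner ahead
    if dist < DIST_THRESHOLD:
        return "Slow"            # corner ahead and close
    # Corner ahead but far away: re-assess at a closer waypoint
    diff_mid, _ = identify_corner(waypoints, closest, MID_STEP)
    return "Bonus Fast" if diff_mid < TURN_THRESHOLD else "Slow"


# Synthetic track (ASSUMED): 8 m straight along x, then a 90-degree left turn.
track = [(float(x), 0.0) for x in range(9)] + [(8.0, float(y)) for y in range(1, 7)]

labels = [classify(track, i) for i in range(len(track) - 1)]
for i, label in enumerate(labels):
    print(i, track[i], label)
```

On this synthetic track the "Bonus Fast" labels appear on the stretch of straight where the far waypoint already lies around the bend but the closer one does not, which is the region the distance check was intended to recover.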
