DRL

Deep Recurrent Q-Network with different exploration strategies for self-driving cars (using AirSim)


Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning

Description

This repo contains an implementation of a Double Dueling Deep Recurrent Q-Network that can be combined with several exploration strategies: deterministic epsilon-greedy, adaptive epsilon-greedy (VDBE and BMC) [1], softmax, Max-Boltzmann exploration and VDBE-softmax. It also includes an error masking strategy [2], [4].
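To make the strategies listed above concrete, here is a minimal NumPy sketch of epsilon-greedy, softmax and Max-Boltzmann action selection, plus one common form of the VDBE epsilon update. It is an illustration only and is not taken from <code>DRQN_classes.py</code>: the function names, the example Q-values, and the parameters `temperature`, `sigma` and `delta` are assumptions made for this example.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; subtracting the max keeps exp() stable."""
    q = np.asarray(q_values, dtype=float)
    z = (q - q.max()) / temperature
    e = np.exp(z)
    return e / e.sum()

def softmax_action(q_values, temperature, rng):
    """Sample an action in proportion to its softmax probability."""
    p = softmax_probs(q_values, temperature)
    return int(rng.choice(len(p), p=p))

def max_boltzmann(q_values, epsilon, temperature, rng):
    """Like epsilon-greedy, but the exploratory action is sampled from the softmax."""
    if rng.random() < epsilon:
        return softmax_action(q_values, temperature, rng)
    return int(np.argmax(q_values))

def vdbe_epsilon(epsilon, value_change, sigma, delta):
    """One common form of the VDBE update (assumed here, not the repo's code).

    value_change is the change applied to Q(s, a) in the last update
    (learning rate times TD error). Large changes push epsilon up,
    small changes let it decay.
    """
    x = np.exp(-abs(value_change) / sigma)
    f = (1.0 - x) / (1.0 + x)
    return delta * f + (1.0 - delta) * epsilon

# Hypothetical Q-values for an environment with five steering angles.
rng = np.random.default_rng(0)
q = [0.1, 0.5, 0.2, 0.9, 0.3]
action = max_boltzmann(q, epsilon=0.1, temperature=0.5, rng=rng)
```

The adaptive variants differ from the deterministic one only in how `epsilon` is produced at each step, so the selection functions can stay unchanged while the schedule is swapped.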

Code Structure:

- <code>./AirsimEnv/</code>: folder containing the two environments, <code>AirsimEnv.py</code> and <code>AirsimEnv_9actions.py</code>; the former uses five steering angles and the latter nine. This folder also contains:
  - <code>DRQN_classes.py</code>: defines the agent, experience replay, exploration strategies, neural network and the connection to AirSim NH
  - <code>bayesian.py</code>: support code for BMC epsilon-greedy
  - <code>final_reward_points.csv</code>: support data for the reward calculation (required by the environment scripts)
- <code>DRQN_airsim_training.py</code>: main script for the training process; contains the training loop and requires all the files listed above
- <code>DRQN_evaluation.py</code>: evaluates model performance on the training and test subsets, each defined by a different set of starting points

The new implementation in TensorFlow 2.x is now available. The previous version contains the implementation of all exploration strategies, while the new code contains the updated neural network.
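For readers unfamiliar with the terms in the description, the following framework-free NumPy sketch shows the three ideas behind the network's training signal: dueling aggregation, the double Q-learning target, and error masking over a recurrent trace. It is a sketch under assumptions, not the repo's TensorFlow code: all function names are hypothetical, and masking exactly the first half of each trace is an assumed choice in the spirit of [2].

```python
import numpy as np

def dueling_q(value, advantage):
    """Combine a state value and per-action advantages:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    advantage = np.asarray(advantage, dtype=float)
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

def double_q_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it."""
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    q_next_online = np.asarray(q_next_online, dtype=float)
    q_next_target = np.asarray(q_next_target, dtype=float)
    best = np.argmax(q_next_online, axis=-1)
    next_value = np.take_along_axis(q_next_target, best[..., None], axis=-1)[..., 0]
    return reward + gamma * (1.0 - done) * next_value

def masked_mse(td_errors):
    """Mean squared TD error over the second half of each trace only.

    td_errors has shape (batch, trace_length). The first half of every trace
    is masked out (assumed split) because the recurrent state has seen little
    history there.
    """
    td_errors = np.asarray(td_errors, dtype=float)
    trace_len = td_errors.shape[1]
    mask = np.zeros(trace_len)
    mask[trace_len // 2:] = 1.0
    kept = mask.sum() * td_errors.shape[0]
    return float(((td_errors ** 2) * mask).sum() / kept)

# Hypothetical values for a single transition with two actions.
target = double_q_target(
    reward=[1.0], done=[0.0],
    q_next_online=[[0.2, 0.8]], q_next_target=[[5.0, 3.0]],
    gamma=0.5,
)
```

In the usage lines, the online network prefers the second action, so the target uses the target network's value for that action (3.0) even though the target network's own maximum is 5.0; this decoupling is what reduces overestimation.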

Prerequisites

- Python 3.7.6
- TensorFlow 2.5.0
- Tornado 4.5.3
- OpenCV 4.5.2.54
- OpenAI Gym 0.18.3
- AirSim 1.5.0

Hardware

2 Tesla M60 GPUs with 8 GB of memory

References

[1] Gimelfarb, M., S. Sanner, and C.-G. Lee, 2020: ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning. CoRR

[2] Juliani, A., 2016: Simple Reinforcement Learning with Tensorflow Part 6: Partial Observability and Deep Recurrent Q-Networks. URL: https://github.com/awjuliani/DeepRL-Agents

[3] Riboni, A., A. Candelieri, and M. Borrotti, 2021: Deep Autonomous Agents comparison for Self-Driving Cars. Proceedings of The 7th International Conference on Machine Learning, Optimization and Big Data - LOD

[4] Welcome to AirSim, https://microsoft.github.io/AirSim/

How to cite

Zangirolami, V. and M. Borrotti, 2024: Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning. Knowledge-Based Systems, 293.

Acknowledgements

I acknowledge the Data Science Lab of the Department of Economics, Management and Statistics (DEMS) of the University of Milan-Bicocca for providing a virtual machine.

DEMO

<video src="https://user-images.githubusercontent.com/78240304/149147549-29936bd7-f629-4b66-a125-ddcd50443bcb.mp4"></video>
