IPLAN
iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
This repository is the codebase for our paper.
This repository was originally forked from https://github.com/oxwhirl/pymarl and https://github.com/carolinewang01/dm2. The MAPPO baseline comes from https://github.com/uoe-agents/epymarl. The GAT-RNN structure is adapted from the G2ANet implementation made by https://github.com/starry-sky6688/MARL-Algorithms. The idea of the instant incentive inference module is adapted from https://github.com/usaywook/gin.
About
- Multi-agent Reinforcement Learning
- Autonomous Driving
- Representation Learning
Table of Contents
- About
- Dependencies
- Installation
- Running iPLAN
- Ablation Study
- IPPO, IPPO-BM, IPPO-GAT, iPLAN-Hard, iPLAN-FC
- Baselines
- Helper Functions
- Compute Navigation Metrics
- Generate Animation
- Plot Reward Curve
- Results
- Animation
- Acknowledgement
- Citation
Dependencies
- PyTorch (1.13.1 + cu116) (GPU)
- stable-baselines3
- Heterogeneous_Highway_Env (Forked from highway-env)
- sacred
- PyYAML
Note: Please use our modified highway-env provided in Heterogeneous_Highway_Env,
as it contains major changes from the original version of highway-env. The
Multi-agent Particle environments used in our repo are also modified; please use
the code given in the envs/mpe folder.
Installation
First, install the dependencies:

```shell
pip install stable-baselines3[extra] pyyaml sacred gym tensorboard_logger
```

Then install our forked version of highway-env:

```shell
pip install Heterogeneous_Highway_Env
```

Finally, install the iPLAN package:

```shell
pip install iPLAN
```
Running iPLAN
In the configuration file config/default.yaml, set up the environment for your experiment:
- Set `env: MPE` for Non-cooperative Navigation or `env: highway` for Heterogeneous Highway
- Set `difficulty: easy` for the easy (Non-cooperative Navigation) / mild (Heterogeneous Highway) scenario, or `difficulty: hard` for the hard (Non-cooperative Navigation) / chaotic (Heterogeneous Highway) scenario
- Set `Behavior_enable: True`
- Set `GAT_enable: True` and `GAT_use_behavior: True`
- Set `soft_update_enable: True` and `behavior_fully_connected: False`
- Run `python3 main.py`
Results, including printed logs, saved models, and TensorBoard logs, are stored in the results folder.
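Putting the settings above together, the relevant portion of config/default.yaml for a full iPLAN run on the chaotic Heterogeneous Highway scenario might look like the excerpt below. The key names restate the steps above; surrounding keys are omitted:

```yaml
# config/default.yaml (excerpt) -- flags for a full iPLAN run
env: highway                     # or MPE for Non-cooperative Navigation
difficulty: hard                 # easy = easy/mild, hard = hard/chaotic scenario
Behavior_enable: True            # enable the behavior (incentive inference) module
GAT_enable: True
GAT_use_behavior: True
soft_update_enable: True         # iPLAN-Hard sets this to False
behavior_fully_connected: False  # iPLAN-FC sets this to True
```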
Ablation Study
When running experiments for the ablation study, please change only the hyperparameters mentioned below
in the configuration file config/default.yaml,
and keep all others the same as in the iPLAN experiment.
Running IPPO
- Set `Behavior_enable: False`
- Set `GAT_enable: False` and `GAT_use_behavior: False`
- Run `python3 main.py`
Running IPPO-BM
- Set `Behavior_enable: True`
- Set `GAT_enable: False` and `GAT_use_behavior: False`
- Run `python3 main.py`
Running IPPO-GAT
- Set `Behavior_enable: False`
- Set `GAT_enable: True` and `GAT_use_behavior: True`
- Run `python3 main.py`
Running iPLAN-Hard
- Set `soft_update_enable: False`
- Run `python3 main.py`
Running iPLAN-FC
- Set `behavior_fully_connected: True`
- Run `python3 main.py`
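Each ablation variant differs from full iPLAN only in these flags. As a sketch, the mapping can be written down explicitly (the helper below just restates the settings listed above; it is not part of the iPLAN codebase):

```python
# Hypothetical helper (not in the repo): flag overrides per ablation variant,
# restating the configuration steps listed above.
IPLAN_DEFAULTS = {
    "Behavior_enable": True,
    "GAT_enable": True,
    "GAT_use_behavior": True,
    "soft_update_enable": True,
    "behavior_fully_connected": False,
}

ABLATION_OVERRIDES = {
    "IPPO":       {"Behavior_enable": False, "GAT_enable": False, "GAT_use_behavior": False},
    "IPPO-BM":    {"GAT_enable": False, "GAT_use_behavior": False},
    "IPPO-GAT":   {"Behavior_enable": False},
    "iPLAN-Hard": {"soft_update_enable": False},
    "iPLAN-FC":   {"behavior_fully_connected": True},
}

def config_for(variant: str) -> dict:
    """Return the full flag set for a variant by applying its overrides."""
    cfg = dict(IPLAN_DEFAULTS)
    cfg.update(ABLATION_OVERRIDES.get(variant, {}))
    return cfg
```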
Baselines
Baselines used in this paper can be found in the baselines folder, whose file organization is
similar to the main iPLAN directory. Please change the environment settings in config/default.yaml
before running experiments. No extra changes are needed.
Helper Functions
Notebooks for the helper functions are given in the helper folder.
Please follow the instructions below:
Compute Navigation Metrics
(Only for Heterogeneous Highway)
In the configuration file config/default.yaml:
- Set `metrics_enable: True`
- Set `num_test_episodes` larger than `batch_size_run`
- Run `python3 main.py`
Navigation metrics are then printed after each episode's execution log.
Use the notebook helper/RL_results_metrics.ipynb to compute averaged navigation
metrics from the printed log file (usually stored in results/sacred).
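The averaging step amounts to parsing the per-episode metric lines and taking the mean of each field. A minimal sketch is below; the exact line format is an assumption, not the repo's literal log output:

```python
# Sketch of what helper/RL_results_metrics.ipynb computes: average the
# per-episode navigation metrics found in a printed log. The line format
# matched by the regex is an assumption about the log, not the exact output.
import re

METRIC_LINE = re.compile(
    r"succeed:\s*(?P<succeed>\d+).*survival:\s*(?P<survival>[\d.]+)"
    r".*speed:\s*(?P<speed>[\d.]+)"
)

def average_metrics(log_lines):
    """Average the navigation metrics over all episodes found in the log."""
    episodes = []
    for line in log_lines:
        m = METRIC_LINE.search(line)
        if m:
            episodes.append({k: float(v) for k, v in m.groupdict().items()})
    if not episodes:
        return {}
    return {k: sum(ep[k] for ep in episodes) / len(episodes) for k in episodes[0]}

log = [
    "episode 1 ... succeed: 5, survival: 90.0, speed: 23.95",
    "episode 2 ... succeed: 3, survival: 70.0, speed: 22.05",
]
print(average_metrics(log))  # mean of each metric over the two episodes
```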
Generate Animation
(Only for Heterogeneous Highway)
In the configuration file config/default.yaml:
- Set `animation_enable: True`
- (Recommended) Set `metrics_enable: True` and set `num_test_episodes` larger than `batch_size_run`
- Run `python3 main.py`
Screenshots of the Heterogeneous Highway are stored in the animation folder. Use the notebook
helper/Gif_helper.ipynb to generate animation from screenshots.
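The stitching step can be sketched with Pillow as below; the glob pattern, output path, and frame duration are assumptions, and the notebook's actual implementation may differ:

```python
# Minimal sketch of the Gif_helper step: stitch saved screenshots into an
# animated GIF with Pillow. Pattern and frame duration are assumptions.
import glob
from PIL import Image

def make_gif(pattern="animation/*.png", out_path="animation/episode.gif",
             ms_per_frame=100):
    # Sort so frames appear in capture order (assumes sortable filenames).
    frames = [Image.open(p) for p in sorted(glob.glob(pattern))]
    if not frames:
        raise FileNotFoundError(f"no screenshots match {pattern}")
    # save_all + append_images writes all frames into one animated GIF.
    frames[0].save(out_path, save_all=True, append_images=frames[1:],
                   duration=ms_per_frame, loop=0)
    return out_path
```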
Plot Reward Curve
Printed log files are usually stored in results/sacred.
- Choose the log file you want to process and use the notebook `helper/RL_results_repack.ipynb` to convert it into a `.csv` file.
- Use the notebook `RL Visualization Helper - Highway.ipynb` (Heterogeneous Highway) or `RL Visualization Helper - MPE.ipynb` (Non-cooperative Navigation) to plot the reward curve from the generated `.csv` files for each approach and scenario.
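The plotting step boils down to reading the repacked `.csv` and drawing a smoothed reward curve. A minimal sketch is below; the column names (`step`, `reward`) and the moving-average window are assumptions about the repacked files:

```python
# Sketch of the reward-curve plotting step. Column names ("step", "reward")
# and the smoothing window are assumptions about the repacked .csv files.
import csv
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt

def plot_reward_curve(csv_path, out_path="reward_curve.png", window=5):
    steps, rewards = [], []
    with open(csv_path) as f:
        for row in csv.DictReader(f):
            steps.append(float(row["step"]))
            rewards.append(float(row["reward"]))
    # Simple trailing moving average to smooth noisy per-episode rewards.
    smoothed = []
    for i in range(len(rewards)):
        chunk = rewards[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    plt.figure()
    plt.plot(steps, rewards, alpha=0.3, label="raw")
    plt.plot(steps, smoothed, label=f"moving avg (w={window})")
    plt.xlabel("step")
    plt.ylabel("reward")
    plt.legend()
    plt.savefig(out_path)
    plt.close()
    return out_path
```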
Results
<p align="center"> <img src="figs/MPE_comb.png"><br/> <em> Non-Cooperative Navigation: with 3 agents in the (left) easy and (right) hard scenarios. 50 steps/episode. </em> </p> <p align="center"> <img src="figs/Hetero_comb.png"><br/> <em> Heterogeneous Highway: with 5 agents in (left) mild and (right) chaotic scenarios. 90 steps/episode. </em> </p>Animation
We visually compare the performance of iPLAN with QMIX and MAPPO. Each baseline is tested with multiple learning agents shown in green, and each animation shows 5 such learning agents from their respective viewpoints.
<p align="center"> <img src="animation/iPLAN_Hetero_E_5_90.0_23.95.gif"><br/> <em> iPLAN in mild (easy) scenario of Heterogeneous Highway (Num of agents succeed: 5, Avg. survival time: 90, Avg. speed: 23.95).</em> </p> <p align="center"> <img src="animation/iPLAN_Hetero_H_5_90.0_21.81.gif"><br/> <em> iPLAN in chaotic (hard) scenario of Heterogeneous Highway (Num of agents succeed: 5, Avg. survival time: 90, Avg. speed: 21.81).</em> </p> <p align="center"> <img src="animation/MAPPO_Hetero_E_2_49.6_28.44.gif"><br/> <em> MAPPO in mild (easy) scenario of Heterogeneous Highway (Num of agents succeed: 2, Avg. survival time: 49.6, Avg. speed: 28.44).</em> </p> <p align="center"> <img src="animation/MAPPO_Hetero_H_2_54.0_28.66.gif"><br/> <em> MAPPO in chaotic (hard) scenario of Heterogeneous Highway (Num of agents succeed: 2, Avg. survival time: 54.0, Avg. speed: 28.44).</em> </p> <p align="center"> <img src="animation/QMIX_Hetero_E_4_72.6_21.2.gif"><br/> <em> QMIX in mild (easy) scenario of Heterogeneous Highway (Num of agents succeed: 4, Avg. survival time: 72.6, Avg. speed: 21.2).</em> </p> <p align="center"> <img src="animation/QMIX_Hetero_H_3_67.8_24.9.gif"><br/> <em> QMIX in chaotic (hard) scenario of Heterogeneous Highway (Num of agents succeed: 3, Avg. survival time: 67.8, Avg. speed: 24.9).</em> </p>Acknowledgement
We thank Haoxiang Zhao (@zhx0506) and other community members for their efforts in maintaining this repo!
Citation
@inproceedings{wu2023intent,
title={Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning},
author={Wu, Xiyang and Chandra, Rohan and Guan, Tianrui and Bedi, Amrit and Manocha, Dinesh},
booktitle={7th Annual Conference on Robot Learning},
year={2023}
}