TCRMDP
Official implementation of Time-Constrained Robust MDP, NeurIPS 2024
Installation
All environments are available in the RRLS package. To install it, run the following commands:
git clone https://github.com/SuReLI/RRLS
cd RRLS
pip install -e .
Then go to the TCRMDP folder and install the requirements:
cd TCRMDP
pip install -r requirements.txt
How to run the code
There is one entry point per algorithm. For example, to list the options of the Stacked TC M2TD3 algorithm, run the following command:
python main_stacked_tc_m2td3.py --help
Codebase
.
└── src
    ├── evaluation.py # Evaluation script
    ├── __init__.py
    ├── m2td3 # This is the official implementation of M2TD3 from Tanabe et al. 2022
    │   ├── agent_wrapper.py
    │   ├── algo.py
    │   ├── factory.py
    │   ├── __init__.py
    │   ├── trainer.py
    │   └── utils.py
    ├── main_dr.py # Entry point for the DR algorithm
    ├── main_m2td3.py # Entry point for the M2TD3 algorithm
    ├── main_oracle_tc_m2td3.py # Entry point for the Oracle TC M2TD3 algorithm
    ├── main_oracle_tc_rarl.py # Entry point for the Oracle TC RARL algorithm
    ├── main_rarl.py # Entry point for the RARL algorithm
    ├── main_stacked_tc_m2td3.py # Entry point for the Stacked TC M2TD3 algorithm
    ├── main_stacked_tc_rarl.py # Entry point for the Stacked TC RARL algorithm
    ├── main_tc_adversary.py # Entry point for training a TC adversary on a trained agent
    ├── main_vanilla.py # Entry point for the TD3 algorithm
    ├── main_vanilla_tc_m2td3.py # Entry point for the TC M2TD3 algorithm
    ├── main_vanilla_tc_rarl.py # Entry point for the TC RARL algorithm
    ├── mock_agent.py
    ├── scheduler.py # Fixed adversary scheduler used in the paper
    ├── tc_mdp.py # This is where the magic happens, the implementation of the TCRMDP
    ├── td3
    │   ├── buffer.py
    │   ├── __init__.py
    │   ├── models.py
    │   ├── td3.py
    │   └── trainer.py
    └── utils.py
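This README does not describe the internals of tc_mdp.py. As a rough illustration of the time-constrained idea only, the sketch below assumes an adversary that may shift the environment parameters by a bounded amount at each step, while staying inside a box-shaped uncertainty set. The function name `tc_adversary_step` and all of its arguments are hypothetical and are not taken from the codebase.

```python
import math


def tc_adversary_step(psi, delta, radius, psi_min, psi_max):
    """Apply one bounded adversary update to the environment parameters.

    All names are hypothetical (not from tc_mdp.py):
      psi              current parameter values (list of floats)
      delta            change proposed by the adversary (same length as psi)
      radius           assumed maximum L2 norm of the change per step
      psi_min, psi_max assumed scalar bounds of the uncertainty set
    """
    # Shrink the proposed change when its norm exceeds the per-step budget.
    norm = math.sqrt(sum(d * d for d in delta))
    scale = radius / norm if norm > radius else 1.0

    # Apply the (possibly rescaled) change, then clip into the uncertainty set.
    return [min(max(p + scale * d, psi_min), psi_max) for p, d in zip(psi, delta)]


if __name__ == "__main__":
    # The proposal has norm 5.0, so it is rescaled to norm 0.1 before use.
    print(tc_adversary_step([1.0, 0.5], [3.0, 4.0], 0.1, 0.0, 2.0))
```

Under these assumptions, the parameters can only drift slowly from one step to the next, in contrast to an unconstrained adversary that may pick any point of the uncertainty set at every step.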
Citation
@inproceedings{
zouitine2024timeconstrained,
title={Time-Constrained Robust {MDP}s},
author={Adil Zouitine and David Bertoin and Pierre Clavier and Matthieu Geist and Emmanuel Rachelson},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=NKpPnb3YNg}
}
