FCSRL
Feasibility Consistent Representation Learning for Safe Reinforcement Learning (ICML 2024). Current SOTA model-free safe RL algorithm on safety-gymnasium
Overview
This repo provides the official implementation of Feasibility Consistent Representation Learning for Safe Reinforcement Learning (ICML 2024). In this paper, we propose a feasibility-based representation learning method that extracts safety-related features to improve safe reinforcement learning. Our method, FCSRL+TD3-Lag, achieves SOTA performance (as of 2024/02) on the safety-gymnasium benchmark among model-free safe RL algorithms. See the paper for more details.
<p align="center"> <img src="assets/framework.png" alt="framework" width=60% > </p>
Installation
- We recommend using Anaconda or Miniconda to manage the Python environment.
- Create the conda env,
cd FCSRL
conda env create -f environment.yaml
conda activate FCSRL
- Install PyTorch according to your platform and CUDA version.
- Install FCSRL,
pip install -e .
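After installation, it can help to confirm that PyTorch is importable and whether it sees a CUDA device before choosing a --cudaid value. The following is a minimal sketch, not part of this repo; the helper name check_torch is hypothetical.

```python
import importlib.util


def check_torch():
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    info = {"installed": False, "version": None, "cuda_available": False}
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in the active environment
    import torch

    info["installed"] = True
    info["version"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    print(check_torch())
```

If cuda_available is False, training on CPU is still possible with --cudaid -1.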
Training
To run a single experiment, for example on PointGoal1, run
python scripts/{BASE_RL_ALG}_repr_CMDP.py --env_name SafetyPointGoal1Gymnasium-v0 --cudaid 0 --seed 100
where {BASE_RL_ALG} can be ppo or td3. For other tasks, replace PointGoal1 in the environment name with a task from [PointGoal1, PointButton1, PointPush1, PointGoal2, CarGoal1, CarButton1]. To train on CPU, replace --cudaid 0 with --cudaid -1.
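To launch several tasks or seeds, the command above can be generated programmatically. The sketch below only builds and prints command lines. It assumes every task follows the Safety{task}Gymnasium-v0 naming pattern of the PointGoal1 example. The helper build_command and the seeds 200 and 300 are illustrative and not part of this repo.

```python
import itertools
import shlex

# Task and algorithm names as listed in this README.
TASKS = ["PointGoal1", "PointButton1", "PointPush1", "PointGoal2", "CarGoal1", "CarButton1"]
BASE_RL_ALGS = ["ppo", "td3"]


def build_command(base_rl_alg, task, seed, cudaid=0):
    """Return the training command as an argument list (hypothetical helper)."""
    if base_rl_alg not in BASE_RL_ALGS:
        raise ValueError(f"unknown base RL algorithm: {base_rl_alg}")
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    # Assumption: environment names follow the pattern of the PointGoal1 example.
    env_name = f"Safety{task}Gymnasium-v0"
    return [
        "python", f"scripts/{base_rl_alg}_repr_CMDP.py",
        "--env_name", env_name,
        "--cudaid", str(cudaid),
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Print one command per (task, seed) pair; the seeds are illustrative.
    for task, seed in itertools.product(TASKS, [100, 200, 300]):
        print(shlex.join(build_command("td3", task, seed)))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run to launch the run directly.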
For image-based tasks, run
python scripts/td3_repr_vision_CMDP.py --env_name SafetyPointGoal2Gymnasium-v0 --cudaid 0 --seed 100
If you need to train and render on CPU, set the environment variables to
os.environ["MUJOCO_GL"] = "osmesa"
os.environ["PYOPENGL_PLATFORM"] = "osmesa"
in the script. However, training on image-based tasks can be very slow without a GPU.
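One way to apply the OSMesa settings only when training on CPU is to tie them to the --cudaid value. This is a minimal sketch under that assumption; the helper configure_mujoco_rendering is hypothetical and not part of this repo. These variables are typically read when MuJoCo and PyOpenGL are first imported, so the call should come before those imports.

```python
import os


def configure_mujoco_rendering(cudaid):
    """Select the software OSMesa backend when cudaid is negative (CPU run)."""
    if cudaid < 0:
        os.environ["MUJOCO_GL"] = "osmesa"
        os.environ["PYOPENGL_PLATFORM"] = "osmesa"
    # Return the backend now in effect (None if the variable is unset).
    return os.environ.get("MUJOCO_GL")


# Call this at the top of the script, before importing anything that loads MuJoCo.
backend = configure_mujoco_rendering(cudaid=-1)
print(f"MUJOCO_GL={backend}")
```

With a non-negative cudaid the function leaves the environment untouched, so an existing GPU rendering setup is not overridden.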
Citation
If you find our work helpful, please cite:
@article{cen2024feasibility,
title={Feasibility Consistent Representation Learning for Safe Reinforcement Learning},
author={Cen, Zhepeng and Yao, Yihang and Liu, Zuxin and Zhao, Ding},
journal={arXiv preprint arXiv:2405.11718},
year={2024}
}
Acknowledgement
This repo is partly based on Tianshou.
