Rllabplusplus
No description available
Install / Use
/learn @rlbayes/RllabplusplusREADME
rllab++
rllab++ is a framework for developing and evaluating reinforcement learning algorithms, built on rllab. It has the following implementations besides the ones implemented in rllab:
The codes are experimental, and may require tuning or modifications to reach the best reported performances.
Installation
Please follow the basic installation instructions in rllab documentation.
Examples
From the launchers directory, run the following, with optional additional flags defined in launcher_utils.py:
python algo_gym_stub.py --exp=<exp_name>
Flags include:
- algo_name: trpo (TRPO), vpg (vanilla policy gradient), ddpg (DDPG), qprop (Q-Prop with trpo), etc. See launcher_utils.py for more variants.
- env_name: OpenAI Gym environment name, e.g. HalfCheetah-v1.
The experiment will be saved in /data/local/<exp_name>.
Citations
If you use rllab++ for academic research, you are highly encouraged to cite the following papers:
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schoelkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". arXiv:1706.00387 [cs.LG], 2017.
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic" Proceedings of the International Conference on Learning Representations (ICLR), 2017.
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel. "Benchmarking Deep Reinforcement Learning for Continuous Control". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
Related Skills
node-connect
342.5kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
85.3kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
342.5kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
342.5kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
