Rllabplusplus

rllab++

rllab++ is a framework for developing and evaluating reinforcement learning algorithms, built on rllab. Beyond the algorithms already implemented in rllab, it adds further implementations, including Q-Prop and Interpolated Policy Gradient (see Citations).

The code is experimental and may require tuning or modification to reach the best reported performance.

Installation

Please follow the basic installation instructions in the rllab documentation.

Examples

From the launchers directory, run the following, with optional additional flags defined in launcher_utils.py:

python algo_gym_stub.py --exp=<exp_name>

Flags include:

algo_name: trpo (TRPO), vpg (vanilla policy gradient), ddpg (DDPG), qprop (Q-Prop with TRPO), etc. See launcher_utils.py for more variants.
env_name: OpenAI Gym environment name, e.g. HalfCheetah-v1.

The experiment will be saved in /data/local/<exp_name>.
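
As a rough illustration of how the flags above combine, the sketch below assembles a launcher command in Python. It assumes that algo_name and env_name are passed in the same --name=value form as --exp, which this README does not state explicitly; check launcher_utils.py for the exact syntax. The helper build_launch_command and the experiment name qprop_cheetah are hypothetical and not part of rllab++.

```python
import shlex


def build_launch_command(exp_name, algo_name=None, env_name=None):
    """Assemble the argv list for algo_gym_stub.py.

    Assumption: algo_name and env_name use the same --name=value
    syntax as --exp. This helper is illustrative, not part of rllab++.
    """
    cmd = ["python", "algo_gym_stub.py", "--exp=" + exp_name]
    if algo_name is not None:
        cmd.append("--algo_name=" + algo_name)
    if env_name is not None:
        cmd.append("--env_name=" + env_name)
    return cmd


# Hypothetical experiment: Q-Prop on HalfCheetah-v1.
cmd = build_launch_command(
    "qprop_cheetah", algo_name="qprop", env_name="HalfCheetah-v1"
)

# Print a shell-safe command line to run from the launchers directory.
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into a shell from the launchers directory, or the list can be passed to subprocess.run(cmd) to start the run directly.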

Citations

If you use rllab++ for academic research, you are highly encouraged to cite the following papers:

Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schoelkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". arXiv:1706.00387 [cs.LG], 2017.
Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic". Proceedings of the International Conference on Learning Representations (ICLR), 2017.
Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel. "Benchmarking Deep Reinforcement Learning for Continuous Control". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
