Flexrl

Non-modular implementation of common RL algorithms

FlexRL

FlexRL is a deep online/offline reinforcement learning library inspired by and adapted from CleanRL and CORL. It provides single-file implementations of algorithms that these libraries do not necessarily cover. FlexRL introduces the following features:

Consistent style across online and offline algorithms
Easy configuration with Pyrallis, plus a tqdm progress bar
A few custom environments under the Gym API

Quick Start

Installing FlexRL

git clone https://github.com/alexchen-buaa/flexrl.git
cd flexrl
pip install -e .

Usage

Run the algorithms as individual scripts. Like CORL, we use Pyrallis for configuration management. Arguments can be specified on the command line, in a YAML file, or both:

python ppo.py --config_path=some_config.yaml
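As a rough illustration of how a Pyrallis-configured script is typically structured, the sketch below declares a config dataclass and parses it with `pyrallis.parse`. The class name `TrainConfig`, its fields, and the helper `load_config` are hypothetical and are not taken from FlexRL's scripts; check each script's own config dataclass for the real argument names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrainConfig:
    # Hypothetical fields for illustration only; FlexRL's scripts define their own.
    env_id: str = "CartPole-v1"
    total_timesteps: int = 100_000
    learning_rate: float = 3e-4
    seed: int = 0


def load_config(args: Optional[List[str]] = None) -> TrainConfig:
    """Build a TrainConfig from command-line arguments and/or a YAML file.

    Pyrallis is imported lazily so the dataclass above can be reused
    (for example in tests) without the third-party dependency installed.
    When args is None, Pyrallis reads the process's command-line arguments.
    """
    import pyrallis

    return pyrallis.parse(config_class=TrainConfig, args=args)
```

With a layout like this, a script that calls `load_config()` could be run as `python some_script.py --config_path=some_config.yaml --learning_rate=1e-3`. Values given on the command line take precedence over those in the YAML file, which in turn override the dataclass defaults.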

Algorithms Implemented

| Type | Algorithm | Variants Implemented |
| -------- | ---------------------------------- | -------------------------------------------------------------- |
| Online | Proximal Policy Optimization (PPO) | ppo.py |
| | | ppo_atari.py |
| | | ppo_multidiscrete.py |
| | Deep Q-Networks (DQN) | dqn.py |
| | | dqn_atari.py |
| | Quantile-Regression DQN (QR-DQN) | qr_dqn.py |
| | | qr_dqn_atari.py |
| | Soft Actor-Critic (SAC) | sac.py |
| Offline | Implicit Q-Learning (IQL) | iql.py |
| | | iql_jax.py |
| | In-Sample Actor-Critic (InAC) | inac.py |
| | | inac_jax.py |
| | Soft Actor-Critic Ensemble (SAC-N) | sac_n_jax.py |

Extra Requirements

Atari/ALE

According to the Arcade Learning Environment documentation, you can use its command-line tool to import your ROMs:

ale-import-roms roms/

MuJoCo

To use MuJoCo envs (for both online training and offline evaluation), you need to install MuJoCo first. See mujoco-py for instructions.

JAX with CUDA Support

To use JAX with CUDA support, you need to install the NVIDIA driver first. See JAX Installation for instructions.
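After installation, a quick check along the following lines can show whether JAX actually sees the GPU. This is a minimal sketch, not part of FlexRL: the helper name `jax_backend_summary` is made up for illustration, and it relies only on the public `jax.default_backend()` and `jax.devices()` calls.

```python
def jax_backend_summary() -> str:
    """Report which backend and devices JAX would use, if JAX is installed."""
    try:
        import jax
    except ImportError:
        return "jax is not installed"

    # default_backend() names the platform JAX selected (e.g. CPU or GPU);
    # devices() lists the individual devices available on that backend.
    backend = jax.default_backend()
    devices = [str(d) for d in jax.devices()]
    return f"backend={backend}, devices={devices}"


print(jax_backend_summary())
```

If the reported backend is the CPU even though a GPU is present, the CUDA-enabled JAX build or the NVIDIA driver is probably not set up correctly.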

References

[1] S. Huang, R. F. J. Dossa, C. Ye, and J. Braga, “CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms.” arXiv, Nov. 16, 2021. Accessed: Nov. 21, 2022. [Online]. Available: http://arxiv.org/abs/2111.08819
[2] A. Raffin, A. Hill, A. Gleave, A. Kanervisto, M. Ernestus, and N. Dormann, “Stable-Baselines3: Reliable Reinforcement Learning Implementations,” Journal of Machine Learning Research, vol. 22, no. 268, pp. 1–8, 2021.
[3] W. Dabney, M. Rowland, M. G. Bellemare, and R. Munos, “Distributional Reinforcement Learning with Quantile Regression,” arXiv:1710.10044 [cs, stat], Oct. 2017, Accessed: Apr. 15, 2022. [Online]. Available: http://arxiv.org/abs/1710.10044
[4] I. Kostrikov, A. Nair, and S. Levine, “Offline Reinforcement Learning with Implicit Q-Learning.” arXiv, Oct. 12, 2021. Accessed: Mar. 29, 2023. [Online]. Available: http://arxiv.org/abs/2110.06169
[5] C. Xiao, H. Wang, Y. Pan, A. White, and M. White, “The In-Sample Softmax for Offline Reinforcement Learning.” arXiv, Feb. 28, 2023. Accessed: Apr. 02, 2023. [Online]. Available: http://arxiv.org/abs/2302.14372

