GenRL
A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
GenRL is a PyTorch reinforcement learning library centered around reproducible, generalizable algorithm implementations and improving accessibility in reinforcement learning.
GenRL's current release is v0.0.2. Expect breaking changes.
Reinforcement learning research is moving faster than ever before. In order to keep up with the growing trend and ensure that RL research remains reproducible, GenRL aims to aid faster paper reproduction and benchmarking by providing the following main features:
- PyTorch-first: Modular, Extensible and Idiomatic Python
- Tutorials and Examples: 20+ tutorials, from basic RL to SOTA deep RL algorithms (with explanations)!
- Unified Trainer and Logging classes: code reusability and a high-level UI
- Ready-made algorithm implementations: popular RL algorithms that work out of the box
- Faster Benchmarking: automated hyperparameter tuning, environment implementations, etc.
By integrating these features into GenRL, we aim to eventually support any new algorithm implementation in less than 100 lines.
If you're interested in contributing, feel free to go through the issues and open PRs for code, docs, tests, etc. If you have any questions, please check out the Contributing Guidelines.
Installation
GenRL is compatible with Python 3.6 or later and also depends on pytorch and openai-gym. The easiest way to install GenRL is with pip, Python's preferred package installer.
$ pip install genrl
Note that GenRL is an active project and routinely publishes new releases. In order to upgrade GenRL to the latest version, use pip as follows.
$ pip install -U genrl
If you want to install the latest unreleased version of the library (i.e., from source), clone the repository and run the setup script:
$ git clone https://github.com/SforAiDl/genrl.git
$ cd genrl
$ python setup.py install
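After installing, a quick sanity check can confirm that the interpreter and the dependencies named above are visible. The snippet below is a minimal sketch, not part of GenRL: the helper name `check_environment` is hypothetical, and it assumes the import names `torch`, `gym` and `genrl` for PyTorch, OpenAI Gym and GenRL respectively.

```python
import importlib.util
import sys

# Assumed import names: PyTorch imports as "torch", OpenAI Gym as "gym".
REQUIRED_MODULES = ("torch", "gym", "genrl")


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 6)):
    """Hypothetical helper: return (python_ok, missing_modules).

    Uses find_spec so nothing is actually imported during the check.
    """
    python_ok = sys.version_info >= min_python
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing


python_ok, missing = check_environment()
print("Python version OK:", python_ok)
print("Missing modules:", missing or "none")
```

If any module is reported missing, re-running the pip command above in the same environment is usually the first thing to try.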
Usage
To train a Soft Actor-Critic model from scratch on the Pendulum-v0 gym environment and log rewards on tensorboard:
import gym
from genrl.agents import SAC
from genrl.trainers import OffPolicyTrainer
from genrl.environments import VectorEnv
# Wrap the Pendulum-v0 environment in GenRL's vectorized environment
env = VectorEnv("Pendulum-v0")
# Create a SAC agent that uses MLP networks
agent = SAC('mlp', env)
# Log training progress to both stdout and tensorboard
trainer = OffPolicyTrainer(agent, env, log_mode=['stdout', 'tensorboard'])
trainer.train()
To train a Tabular Dyna-Q model from scratch on the FrozenLake-v0 gym environment and plot rewards:
import gym
from genrl.agents import QLearning
from genrl.trainers import ClassicalTrainer
env = gym.make("FrozenLake-v0")
agent = QLearning(env)
# mode="dyna" with model="tabular" selects Dyna-style training with a tabular model
trainer = ClassicalTrainer(agent, env, mode="dyna", model="tabular", n_episodes=10000)
# train() returns the per-episode rewards, which are then plotted
episode_rewards = trainer.train()
trainer.plot(episode_rewards)
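For readers unfamiliar with Dyna-Q, the idea behind the example above can be sketched in plain Python: each real transition updates the Q-table directly, is stored in a learned tabular model, and is then replayed in a few simulated planning updates. This is an illustrative sketch only, not GenRL's implementation; the six-state chain environment, the function names and all hyperparameter values are assumptions made for this example.

```python
import random
from collections import defaultdict

N_STATES = 6      # hypothetical chain: states 0..5, state 5 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy dynamics: reward 1 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def dyna_q(n_episodes=200, planning_steps=10, alpha=0.5, gamma=0.95,
           epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-table keyed by (state, action)
    model = {}              # learned model: (state, action) -> (reward, next_state, done)

    def update(s, a, r, s2, done):
        # Standard one-step Q-learning update
        target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(n_episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state, rng)
            next_state, reward, done = step(state, action)
            update(state, action, reward, next_state, done)      # learn from real experience
            model[(state, action)] = (reward, next_state, done)  # remember the transition
            for _ in range(planning_steps):                      # replay simulated experience
                (s, a), (r, s2, d) = rng.choice(list(model.items()))
                update(s, a, r, s2, d)
            state = next_state
    return q


q_values = dyna_q()
# Greedy action for each non-terminal state
policy = [max(ACTIONS, key=lambda a: q_values[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The planning loop is what separates Dyna-Q from plain Q-learning: setting `planning_steps=0` in this sketch reduces it to ordinary tabular Q-learning.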
Tutorials
Algorithms
Deep RL
- DQN (Deep Q Networks)
  - DQN
  - Double DQN
  - Dueling DQN
  - Noisy DQN
  - Categorical DQN
- VPG (Vanilla Policy Gradients)
- A2C (Advantage Actor-Critic)
- PPO (Proximal Policy Optimization)
- DDPG (Deep Deterministic Policy Gradients)
- TD3 (Twin Delayed DDPG)
- SAC (Soft Actor-Critic)
Classical RL
- SARSA
- Q Learning
Bandit RL
- Multi Armed Bandits
  - Eps Greedy
  - UCB
  - Thompson Sampling
  - Bayesian Bandits
  - Softmax Explorer
- Contextual Bandits
  - Eps Greedy
  - UCB
  - Thompson Sampling
  - Bayesian Bandits
  - Softmax Explorer
- Deep Contextual Bandits
  - Variational Inference
  - Noise sampling for neural network parameters
  - Epsilon greedy with a neural network
  - Bayesian Regression for posterior inference
  - Bootstrapped Ensemble
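As a rough illustration of the simplest strategy listed under Multi Armed Bandits, the sketch below runs epsilon-greedy on a Bernoulli bandit in plain Python. It is not GenRL's bandit API: the function name `run_epsilon_greedy`, the arm probabilities and the hyperparameters are all assumptions chosen for this example.

```python
import random


def run_epsilon_greedy(arm_probs, n_steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit and return (value_estimates, pull_counts)."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    estimates = [0.0] * n_arms  # running mean reward per arm
    counts = [0] * n_arms       # number of pulls per arm
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda i: estimates[i])  # exploit
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical three-armed bandit; the last arm pays out most often
estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
print("estimates:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
```

The other explorers in the list differ mainly in how the arm is chosen at each step, for example by adding a confidence bonus (UCB) or by sampling from a posterior (Thompson Sampling).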
Credits and Similar Libraries:
- Gym - Environments
- Ray
- OpenAI Baselines - Logger
- Stable Baselines 3: Stable Baselines aims to provide baselines for Deep RL Algorithms.
- pytorch-a2c-ppo-acktr
- Deep Contextual Bandits
