
RLs

Reinforcement Learning Algorithms Based on PyTorch


<div align="center"> <a href="https://github.com/StepNeverStop/RLs"> <img width="auto" height="200px" src="./pics/logo.png"> </a> <br/> <br/> <a href="https://github.com/StepNeverStop/RLs"> <img width="auto" height="20px" src="./pics/font.png"> </a> </div> <div align="center"> <p><strong>RLs:</strong> Reinforcement Learning Algorithm Based On PyTorch.</p> </div>

RLs

This project includes SOTA and classic reinforcement learning algorithms (single- and multi-agent) for training agents that interact with Unity through ml-agents Release 18 or with gym.

About

The goal of this framework is to provide stable implementations of standard RL algorithms and simultaneously enable fast prototyping of new methods. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research).

Characteristics

This project supports:

  • Runs on Windows, Linux, and macOS.
  • Single- and multi-agent training.
  • Multiple types of observation sensors as input.
  • Only 3 steps are needed to implement a new algorithm:
    1. policy: write a .py file in the rls/algorithms/{single/multi} directory and make the policy inherit from the base class defined in rls/algorithms/base
    2. config: write a .yaml file in the rls/configs/algorithms/ directory and specify the super config type defined in rls/configs/algorithms/general.yaml
    3. register: register the new algorithm in rls/algorithms/__init__.py
  • Only 3 steps are needed to adapt to a new training environment:
    1. wrapper: write environment wrappers in the rls/envs/{new platform} directory and make them inherit from the base class defined in rls/envs/env_base.py
    2. config: write the default configuration in rls/configs/{new platform}
    3. register: register the new environment platform in rls/envs/__init__.py
  • Compatible with several environment platforms
    • Unity3D ml-agents.
    • PettingZoo
    • gym; for now only two data types are compatible: [Box, Discrete]. Parallel training with gym envs is supported: just set --copies to the number of agents you want to train in parallel.
      • Supported observation -> action combinations:
        • Discrete -> Discrete
        • Discrete -> Box
        • Box -> Discrete
        • Box -> Box
        • Box/Discrete -> Tuple(Discrete, Discrete, Discrete)
  • Four types of Replay Buffer; the default is ER.
  • Noisy Net for better exploration.
  • Intrinsic Curiosity Module for almost all implemented off-policy algorithms.
  • Parallel training of multiple scenes for gym.
  • Unified data format
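The three algorithm steps above (inherit, configure, register) can be sketched with a toy registry. Note that `BasePolicy`, `register`, and `ALGO_REGISTRY` are illustrative stand-ins, not RLs' actual API; the real base classes live in rls/algorithms/base and the registry in rls/algorithms/__init__.py.

```python
# Illustrative sketch of the inherit-then-register pattern described above.
# All names here are hypothetical stand-ins for the real RLs classes.
ALGO_REGISTRY = {}


def register(name):
    """Step 3: map a command-line algorithm name to its policy class."""
    def decorator(cls):
        ALGO_REGISTRY[name] = cls
        return cls
    return decorator


class BasePolicy:
    """Step 1: the base class every new policy inherits from."""
    def __init__(self, **config):
        self.config = config  # step 2: values would come from a .yaml file

    def select_action(self, obs):
        raise NotImplementedError


@register("my_algo")
class MyAlgo(BasePolicy):
    def select_action(self, obs):
        return 0  # placeholder policy: always pick action 0


policy = ALGO_REGISTRY["my_algo"](lr=3e-4)
print(policy.select_action(obs=[0.1, 0.2]))  # 0
```

The decorator keeps registration next to the class definition, so adding an algorithm never requires editing a central dispatch table by hand.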

Installation

method 1:

$ git clone https://github.com/StepNeverStop/RLs.git
$ cd RLs
$ conda create -n rls python=3.8
$ conda activate rls
# Windows
$ pip install -e .[windows]
# Linux or Mac OS
$ pip install -e .

method 2:

conda env create -f environment.yaml

If using ml-agents:

$ pip install -e .[unity]

You can download the pre-built Docker image from here:

$ docker pull keavnn/rls:latest

If you want to send a PR, please format all code files first:

$ pip install -e .[pr]
$ python auto_format.py -d ./

Implemented Algorithms

For now, these algorithms are available:

| Algorithms | Discrete | Continuous | Image | RNN | Command parameter |
| :---: | :---: | :---: | :---: | :---: | :---: |
| PG | ✓ | ✓ | ✓ | ✓ | pg |
| AC | ✓ | ✓ | ✓ | ✓ | ac |
| A2C | ✓ | ✓ | ✓ | ✓ | a2c |
| NPG | ✓ | ✓ | ✓ | ✓ | npg |
| TRPO | ✓ | ✓ | ✓ | ✓ | trpo |
| PPO | ✓ | ✓ | ✓ | ✓ | ppo |
| DQN | ✓ | | ✓ | ✓ | dqn |
| Double DQN | ✓ | | ✓ | ✓ | ddqn |
| Dueling Double DQN | ✓ | | ✓ | ✓ | dddqn |
| Averaged DQN | ✓ | | ✓ | ✓ | averaged_dqn |
| Bootstrapped DQN | ✓ | | ✓ | ✓ | bootstrappeddqn |
| Soft Q-Learning | ✓ | | ✓ | ✓ | sql |
| C51 | ✓ | | ✓ | ✓ | c51 |
| QR-DQN | ✓ | | ✓ | ✓ | qrdqn |
| IQN | ✓ | | ✓ | ✓ | iqn |
| Rainbow | ✓ | | ✓ | ✓ | rainbow |
| DPG | ✓ | ✓ | ✓ | ✓ | dpg |
| DDPG | ✓ | ✓ | ✓ |

(the remainder of the table is truncated in the source)
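One pattern visible in the table: value-based methods such as DQN and its variants have no check in the Continuous column, because they act by taking an argmax over a finite set of Q-values. A minimal epsilon-greedy sketch illustrates this (Q-values are faked here; no network involved):

```python
import random


def epsilon_greedy(q_values, epsilon):
    """Pick a discrete action index: random with probability epsilon,
    otherwise the argmax. The argmax over a finite action set is why
    DQN-style methods require a Discrete action space."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


q = [0.1, 0.9, 0.3]            # toy Q-values for 3 actions
print(epsilon_greedy(q, 0.0))  # epsilon=0 is purely greedy: prints 1
```

With a Box (continuous) action space there is no finite list to maximize over, which is why continuous control needs an explicit actor, as in DPG/DDPG.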
