RLeXplore

RLeXplore provides stable baselines of exploration methods in reinforcement learning, such as the intrinsic curiosity module (ICM), random network distillation (RND), and rewarding impact-driven exploration (RIDE).

<div align=center> <br> <img src='./assets/logo.png' style="width: 70%"> <br>

RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning

</div>

RLeXplore is a unified, highly modularized, plug-and-play toolkit that currently provides high-quality, reliable implementations of eight representative intrinsic reward algorithms. Comparing intrinsic reward algorithms has been challenging because of confounding factors such as differing implementations, optimization strategies, and evaluation methodologies. RLeXplore therefore provides unified and standardized procedures for constructing, computing, and optimizing intrinsic reward modules.

The workflow of RLeXplore is illustrated as follows:

<div align=center> <img src='./assets/workflow.png' style="width: 100%"> </div>
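To make the idea of constructing, computing, and optimizing an intrinsic reward concrete, here is a minimal, library-independent sketch. It is not RLeXplore's API: the class `CountBonus`, the helper `mix_rewards`, and the scale `beta` are hypothetical names chosen for illustration, and the reward is a simple visit-count bonus rather than any of the toolkit's modules.

```python
import math
from collections import Counter


class CountBonus:
    """Toy count-based intrinsic reward: bonus = beta / sqrt(N(s))."""

    def __init__(self, beta=1.0):
        # "Construct" step: choose the bonus scale and start with empty counts.
        self.beta = beta
        self.counts = Counter()

    def update(self, states):
        # "Optimize" step: for this toy module, just refresh the visit counts.
        self.counts.update(states)

    def compute(self, states):
        # "Compute" step: rarely visited states earn a larger bonus.
        return [self.beta / math.sqrt(max(self.counts[s], 1)) for s in states]


def mix_rewards(extrinsic, intrinsic):
    # Add the exploration bonus to the task (extrinsic) reward.
    return [e + i for e, i in zip(extrinsic, intrinsic)]


# Hypothetical rollout: hashable state identifiers and their task rewards.
states = ["s0", "s1", "s0", "s2"]
extrinsic = [0.0, 0.0, 1.0, 0.0]

bonus = CountBonus(beta=0.5)
bonus.update(states)
intrinsic = bonus.compute(states)
total = mix_rewards(extrinsic, intrinsic)
print(intrinsic)
print(total)
```

In this sketch the state visited twice (`s0`) receives a smaller bonus than the states visited once, which is the basic effect an exploration bonus is meant to have.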

Installation

  • with pip (recommended)

Open a terminal, create and activate a conda environment, then install rllte with pip:

conda create -n rllte python=3.8
conda activate rllte
pip install rllte-core

  • with git

Open a terminal, clone the repository from GitHub with git, and install it in editable mode from the repository root:

git clone https://github.com/RLE-Foundation/rllte.git
cd rllte
pip install -e .

You can now import the intrinsic reward modules:

from rllte.xplore.reward import ICM, RIDE, ...
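Before running the import, it can help to confirm that the package is visible to the current interpreter. The snippet below is a small sketch using only the Python standard library; the helper name `rllte_installed` is hypothetical, and the check only looks for the top-level `rllte` package without importing it.

```python
import importlib.util


def rllte_installed():
    # find_spec returns None when the top-level package cannot be located.
    return importlib.util.find_spec("rllte") is not None


installed = rllte_installed()
print("rllte found" if installed else "rllte not found; re-check the installation steps")
```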

Module List

| Type | Modules |
|--- |--- |
| Count-based | PseudoCounts, RND, E3B |
| Curiosity-driven | ICM, Disagreement, RIDE |
| Memory-based | NGU |
| Information theory-based | RE3 |
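As a rough illustration of the curiosity-driven family, the sketch below uses the prediction error of a learned forward model as the intrinsic reward. This is a toy under stated assumptions, not the ICM, Disagreement, or RIDE implementation: the class `LinearForwardModel`, the one-dimensional states, and the transitions are all hypothetical.

```python
class LinearForwardModel:
    """Toy 1-D forward model: predicts the next state as w * s + b."""

    def __init__(self, lr=0.1):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def intrinsic_reward(self, s, s_next):
        # Curiosity signal: squared prediction error of the forward model.
        return (self.w * s + self.b - s_next) ** 2

    def update(self, s, s_next):
        # One SGD step on the squared prediction error.
        diff = self.w * s + self.b - s_next
        self.w -= self.lr * 2.0 * diff * s
        self.b -= self.lr * 2.0 * diff


# Hypothetical transitions from an environment where s_next = s + 1.
transitions = [(0.0, 1.0), (0.5, 1.5), (1.0, 2.0)]
model = LinearForwardModel()

reward_before = model.intrinsic_reward(1.0, 2.0)
for _ in range(200):
    for s, s_next in transitions:
        model.update(s, s_next)
reward_after = model.intrinsic_reward(1.0, 2.0)

print(reward_before, reward_after)
```

The reward for a transition shrinks as the model learns to predict it, so the agent is pushed toward transitions it cannot yet predict.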

Tutorials

The following tutorials are available as code notebooks:

  1. Quick Start
  2. RLeXplore with RLLTE
  3. RLeXplore with Stable-Baselines3
  4. RLeXplore with CleanRL
  5. Exploring Hybrid Intrinsic Rewards
  6. Custom Intrinsic Rewards

Benchmark Results

We have published a Weights & Biases (W&B) space that stores reusable experiment results on recognized benchmarks: RLeXplore's W&B Space.

<div align=center> <img src='./assets/wandb.png' style="width: 75%"> </div>
  • RLLTE's PPO+RLeXplore on SuperMarioBros:
<div align=center> <img src='./assets/smb.png' style="width: 100%"> </div>
  • RLLTE's PPO+RLeXplore on MiniGrid:

    • DoorKey-16×16
    <div align=center> <img src='./assets/mgd.png' style="width: 100%"> </div>
    • KeyCorridorS8R5, KeyCorridorS9R6, KeyCorridorS10R7, MultiRoom-N7-S8, MultiRoom-N10-S10, MultiRoom-N12-S10, Dynamic-Obstacles-16x16, and LockedRoom
    <div align=center> <img src='./assets/mg_hard.png' style="width: 100%"> </div>
  • RLLTE's PPO+RLeXplore on Procgen-Maze:

    • Number of levels=1
    <div align=center> <img src='./assets/procgen_1maze.png' style="width: 100%"> </div>
    • Number of levels=200
    <div align=center> <img src='./assets/procgen_allmaze.png' style="width: 100%"> </div>
  • RLLTE's PPO+RLeXplore on five hard-exploration tasks of ALE:

| Algorithm | Gravitar | MontezumaRevenge | PrivateEye | Seaquest | Venture |
|:-------------:|:------------:|:--------------------:|:--------------:|:------------:|:-----------:|
| Extrinsic | 1060.19 | 42.83 | 88.37 | 942.37 | 391.73 |
| Disagreement | 689.12 | 0.00 | 33.23 | 6577.03 | 468.43 |
| E3B | 503.43 | 0.50 | 66.23 | 8690.65 | 0.80 |
| ICM | 194.71 | 31.14 | -27.50 | 2626.13 | 0.54 |
| PseudoCounts | 295.49 | 0.00 | 1076.74 | 668.96 | 1.03 |
| RE3 | 130.00 | 2.68 | 312.72 | 864.60 | 0.06 |
| RIDE | 452.53 | 0.00 | -1.40 | 1024.39 | 404.81 |
| RND | 835.57 | 160.22 | 45.85 | 5989.06 | 544.73 |
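The ALE table above is easier to read with the top scorer per game picked out. The short script below does that with the scores copied from the table; the names `GAMES`, `SCORES`, and `best_per_game` are illustrative only and are not part of RLeXplore.

```python
# Scores copied from the ALE table above (algorithm -> score per game).
GAMES = ["Gravitar", "MontezumaRevenge", "PrivateEye", "Seaquest", "Venture"]
SCORES = {
    "Extrinsic":    [1060.19, 42.83, 88.37, 942.37, 391.73],
    "Disagreement": [689.12, 0.00, 33.23, 6577.03, 468.43],
    "E3B":          [503.43, 0.50, 66.23, 8690.65, 0.80],
    "ICM":          [194.71, 31.14, -27.50, 2626.13, 0.54],
    "PseudoCounts": [295.49, 0.00, 1076.74, 668.96, 1.03],
    "RE3":          [130.00, 2.68, 312.72, 864.60, 0.06],
    "RIDE":         [452.53, 0.00, -1.40, 1024.39, 404.81],
    "RND":          [835.57, 160.22, 45.85, 5989.06, 544.73],
}


def best_per_game(scores, games):
    # For each game (column), pick the algorithm with the highest score.
    best = {}
    for i, game in enumerate(games):
        best[game] = max(scores, key=lambda name: scores[name][i])
    return best


best = best_per_game(SCORES, GAMES)
for game, name in best.items():
    print(f"{game}: {name}")
```

On these numbers, the extrinsic-only baseline leads on Gravitar, RND on MontezumaRevenge and Venture, PseudoCounts on PrivateEye, and E3B on Seaquest.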

  • CleanRL's PPO+RLeXplore's RND on Montezuma's Revenge:
<div align=center> <img src='./assets/atari_curves.png' style="width: 70%"> </div>
  • RLLTE's SAC+RLeXplore on Ant-UMaze:
<div align=center> <img src='./assets/sac_ant.png' style="width: 70%"> </div>

Cite Us

To cite this repository in publications:

@article{yuan_roger2025rlexplore,
  title={RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning},
  author={Yuan, Mingqi and Castanyer, Roger Creus and Li, Bo and Jin, Xin and Berseth, Glen and Zeng, Wenjun},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=B9BHjTN4z6},
  note={}
}
