Fastpbrl
Vectorization techniques for fast population-based training.
Fast Population-Based Reinforcement Learning
This repository contains the code for the InstaDeep paper "Fast Population-Based Reinforcement Learning on a Single Machine" (Flajolet et al., 2022) :computer::zap:.
First-time setup
Install Docker
This code requires Docker to run. To install Docker, please follow the online instructions here. To enable the code to run on a GPU, please install nvidia-docker as well as the latest NVIDIA driver available for your GPU.
Build and run a docker image
Once Docker and nvidia-docker are installed, you can build the Docker image with the following command:
make build
and, once the image is built, start the container with:
make dev_container
Inside the container, you can run the nvidia-smi command to verify that your GPU is detected.
Run preconfigured scripts
Replicate the experiments from the paper
We provide scripts and commands to replicate the experiments discussed in the paper. All these commands are defined in the Makefile at the root of the repository.
To replicate the experiments corresponding to Figure 2 (where we measure the runtime of a population-wide update step with various implementations), run:
make run_timing_sactd3
make run_timing_dqn
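The speedup measured in these timing experiments comes from vectorizing the per-agent update over the whole population with `jax.vmap` and compiling the result with `jax.jit`, so all agents update in a single device call instead of a Python loop. A minimal, self-contained sketch of that pattern (the loss, parameter shapes, and learning rate below are illustrative stand-ins, not the repo's actual agents):

```python
import jax
import jax.numpy as jnp

def loss_fn(params, batch):
    # Toy quadratic loss standing in for an RL actor/critic loss.
    x, y = batch
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

def single_update(params, batch, lr=1e-3):
    # One SGD step for a single agent's parameters.
    grads = jax.grad(loss_fn)(params, batch)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Vectorize the per-agent update over a leading population axis, then
# JIT-compile so the whole population updates in one compiled call.
population_update = jax.jit(jax.vmap(single_update))

pop_size, obs_dim, batch_size = 8, 4, 32
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = {
    "w": jax.random.normal(k1, (pop_size, obs_dim)),
    "b": jnp.zeros((pop_size,)),
}
batch = (
    jax.random.normal(k2, (pop_size, batch_size, obs_dim)),  # per-agent inputs
    jax.random.normal(k3, (pop_size, batch_size)),           # per-agent targets
)
new_params = population_update(params, batch)
print(new_params["w"].shape)  # (8, 4): one updated parameter set per agent
```

Timing this call against a Python loop over `pop_size` separate `single_update` calls is, in spirit, what the Figure 2 experiments measure at scale.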
To replicate the experiments discussed in Section 5 (which correspond to full training runs), run the following:
make run_td3_cemrl
make run_td3_dvd
make run_td3_pbt
make run_sac_pbt
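The PBT runs above periodically have poorly performing population members copy the weights of top performers and perturb their hyperparameters. A minimal, repo-agnostic sketch of that exploit/explore step (the member fields and perturbation factors are hypothetical, not the repo's actual API):

```python
import random

def pbt_exploit_explore(population, frac=0.25, perturb=0.2):
    """One PBT step on a list of member dicts with 'score', 'weights', 'lr'.
    Bottom performers copy a top performer's weights (exploit) and
    perturb its learning rate (explore). Illustrative only."""
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    n_cut = max(1, int(len(ranked) * frac))
    top, bottom = ranked[:n_cut], ranked[-n_cut:]
    for member in bottom:
        parent = random.choice(top)
        member["weights"] = dict(parent["weights"])  # exploit: copy weights
        factor = 1 + perturb if random.random() < 0.5 else 1 - perturb
        member["lr"] = parent["lr"] * factor         # explore: perturb lr
    return population

pop = [{"score": s, "weights": {"w": s}, "lr": 1e-3} for s in (0.1, 0.5, 0.9, 0.3)]
pop = pbt_exploit_explore(pop)
```

With a population of 4 and `frac=0.25`, the single worst member inherits the best member's weights and a learning rate perturbed by +/-20%.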
Note that DvD training runs are unstable and sometimes crash early due to NaNs.
We use TensorBoard to log metrics during training. The tensorboard command to run to visualize them is printed when the experiment starts.
Launch a test script
Run the following command to start a short test that validates that the training scripts work as expected.
make test_training_scripts
Contributors
<a href="https://github.com/thomashirtz" title="Thomas Hirtz"><img src="https://github.com/thomashirtz.png" height="auto" width="50" style="border-radius:50%"></a> <a href="https://github.com/flajolet" title="Arthur Flajolet"><img src="https://github.com/flajolet.png" height="auto" width="50" style="border-radius:50%"></a> <a href="https://github.com/cibeah" title="Claire Bizon Monroc"><img src="https://github.com/cibeah.png" height="auto" width="50" style="border-radius:50%"></a> <a href="https://github.com/ranzenTom" title="Thomas Pierrot"><img src="https://github.com/ranzenTom.png" height="auto" width="50" style="border-radius:50%"></a>
Citing this work
If you use the code or data in this package, please cite:
@inproceedings{flajolet2022fast,
title={Fast Population-Based Reinforcement Learning on a Single Machine},
author={Flajolet, Arthur and Monroc, Claire Bizon and Beguir, Karim and Pierrot, Thomas},
booktitle={International Conference on Machine Learning},
pages={6533--6547},
year={2022},
organization={PMLR}
}