Behaviour Suite for Reinforcement Learning (`bsuite`)

Introduction

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent, with two main objectives:

- To collect clear, informative and scalable problems that capture key issues in the design of efficient and general learning algorithms.
- To study agent behavior through their performance on these shared benchmarks.

This library automates evaluation and analysis of any agent on these benchmarks. It facilitates reproducible and accessible research on the core issues in RL and, ultimately, the design of superior learning algorithms.

Going forward, we hope to incorporate more excellent experiments from the research community, and we commit to a periodic review of the experiments by a committee of prominent researchers.

For a more comprehensive overview, see the accompanying [paper].

Technical overview

bsuite is a collection of experiments, defined in the [experiments] subdirectory. Each subdirectory corresponds to one experiment and contains:

- A file defining an RL environment, which may be configurable to provide, for example, different levels of difficulty or different random seeds.
- A sequence of keyword arguments for this environment, defined in the SETTINGS variable found in the experiment's sweep.py file.
- A file analysis.py defining plots used in the provided Jupyter notebook.

bsuite works by logging results from "within" each environment when the environment is loaded via a load_and_record* function. This means any experiment will automatically output data in the correct format for analysis using the notebook, without any constraints on the structure of agents or algorithms.
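To make the idea of logging from "within" the environment concrete, here is a minimal sketch of a recording wrapper. Everything in it is a hypothetical stand-in written for illustration (`TimeStep`, `CountdownEnv`, `RecordingWrapper`, and the record fields are not bsuite's own names), and it is not how the load_and_record* functions are actually implemented. It only shows why the agent needs no logging code of its own.

```python
import collections

# Hypothetical stand-in for a dm_env-style timestep; not bsuite's own type.
TimeStep = collections.namedtuple(
    'TimeStep', ['is_last', 'reward', 'observation'])


class CountdownEnv:
  """Toy environment (illustrative only): each episode lasts `length` steps."""

  def __init__(self, length=3):
    self._length = length
    self._remaining = length

  def reset(self):
    self._remaining = self._length
    return TimeStep(is_last=False, reward=None, observation=self._remaining)

  def step(self, action):
    self._remaining -= 1
    return TimeStep(
        is_last=self._remaining == 0, reward=1.0, observation=self._remaining)


class RecordingWrapper:
  """Wraps an environment and records each completed episode internally."""

  def __init__(self, env, records):
    self._env = env
    self._records = records  # Any list-like sink; a real logger writes to disk.
    self._episode = 0
    self._episode_return = 0.0

  def reset(self):
    self._episode_return = 0.0
    return self._env.reset()

  def step(self, action):
    timestep = self._env.step(action)
    self._episode_return += timestep.reward
    if timestep.is_last:
      self._episode += 1
      self._records.append(
          {'episode': self._episode, 'episode_return': self._episode_return})
    return timestep


# The agent side only calls reset() and step(); it never touches the records.
records = []
env = RecordingWrapper(CountdownEnv(length=3), records)
for _ in range(2):
  timestep = env.reset()
  while not timestep.is_last:
    timestep = env.step(action=0)
```

Because the wrapper exposes the same reset() and step() methods as the environment it wraps, the same agent code runs with or without logging.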

We collate all of the results and analysis in a pre-made Jupyter notebook: bit.ly/bsuite-colab.

Getting started

If you are new to bsuite, you can get started with our colab tutorial. This Jupyter notebook is hosted on a free cloud server, so you can start coding right away without installing anything on your machine. After this, you can follow the instructions below to get bsuite running on your local machine.

Installation

We have tested bsuite on Python 3.6 & 3.7. To install the dependencies:

Optional: We recommend using a Python virtual environment to manage your dependencies, so as not to clobber your system installation:
```
python3 -m venv bsuite
source bsuite/bin/activate
pip install --upgrade pip setuptools
```
Install bsuite directly from PyPI:
```
pip install bsuite
```
Optional: To also install dependencies for the [baselines] examples (excluding OpenAI and Dopamine examples), run:
```
pip install bsuite[baselines]
```

Environments

Complete descriptions of each environment and their corresponding experiments are found in the [analysis/results.ipynb] Jupyter notebook.

These environments all have small observation sizes, allowing for reasonable performance with a small network on a CPU.

Loading an environment

Environments are specified by a bsuite_id string, for example "deep_sea/7". This string denotes the experiment and the (index of the) environment settings to use, as described in the technical overview section.

For a full description of each environment and its corresponding experiment settings, see the [paper].

```
import bsuite

env = bsuite.load_from_id('catch/0')
```
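As a rough illustration of the bsuite_id format described above, the following sketch splits an id into its experiment name and settings index. `parse_bsuite_id` is a hypothetical helper written for this example, not part of the bsuite API, and it assumes the id always has the form "<experiment>/<index>".

```python
def parse_bsuite_id(bsuite_id):
  """Splits an id such as 'deep_sea/7' into ('deep_sea', 7).

  Hypothetical helper for illustration; not a bsuite function.
  """
  experiment_name, separator, setting_index = bsuite_id.partition('/')
  if not separator or not experiment_name or not setting_index.isdigit():
    raise ValueError(f'Expected "<experiment>/<index>", got {bsuite_id!r}')
  return experiment_name, int(setting_index)


experiment_name, setting_index = parse_bsuite_id('deep_sea/7')
print(experiment_name, setting_index)  # prints: deep_sea 7
```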

The sequence of bsuite_ids required to run all experiments can be accessed programmatically via:

```
from bsuite import sweep

sweep.SWEEP
```

This module also contains bsuite_ids for each experiment individually via uppercase constants corresponding to the experiment name, for example:

```
sweep.DEEP_SEA
sweep.DISCOUNTING_CHAIN
```

In addition, sequences of bsuite_ids with the same tag can be loaded via:

```
from bsuite import sweep

sweep.TAGS
```

The TAGS variable groups bsuite environments together by their underlying tag, so all the basic tasks or scale tasks can be loaded with:

```
sweep.TAGS['basic']
sweep.TAGS['scale']
```

Loading an environment with logging included

We include one implementation of automatic logging, available via [bsuite.load_and_record_to_csv]. This outputs one CSV file per bsuite_id, so it is suitable for running a set of bsuite experiments split over multiple machines. The implementation is in [logging/csv_logging.py].

Note, older versions of bsuite included an SQLite logger. If you would like to use this, please contact us and we can update and reinstate it.

We also include a terminal logger in [logging/terminal_logging.py], exposed via bsuite.load_and_record_to_terminal.

It is easy to write your own logging mechanism, if you need to save results to a different storage system. See the CSV implementation for the simplest reference.
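As a sketch of what a custom logging mechanism might look like, the class below writes each record as one JSON line to a stream. It rests on an assumption that should be checked against the CSV implementation: that a logger only needs a single `write(data)` method receiving a mapping of field names to values. The class name, the record fields, and the use of an in-memory stream are all hypothetical.

```python
import io
import json


class JsonLinesLogger:
  """Hypothetical logger that writes each record as one JSON line.

  ASSUMPTION: the logging interface is a single `write(data)` method that
  receives a mapping. Check the CSV implementation for the actual interface.
  """

  def __init__(self, stream, bsuite_id):
    self._stream = stream
    self._bsuite_id = bsuite_id

  def write(self, data):
    # Tag every record with the bsuite_id so results can be collated later.
    record = dict(data, bsuite_id=self._bsuite_id)
    self._stream.write(json.dumps(record, sort_keys=True) + '\n')


stream = io.StringIO()  # Stand-in for a file, socket, or database client.
logger = JsonLinesLogger(stream, bsuite_id='catch/0')
logger.write({'episode': 1, 'episode_return': 1.0})
logger.write({'episode': 2, 'episode_return': -1.0})
lines = stream.getvalue().splitlines()
```

Swapping `io.StringIO()` for an open file handle or a storage client is the only change needed to send the same records elsewhere.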

Interacting with an environment

Our environments implement the Python interface defined in dm_env.

More specifically, all our environments accept a discrete, zero-based integer action (or equivalently, a scalar numpy array with shape ()).

To determine the number of actions for a specific environment, use:

```
num_actions = env.action_spec().num_values
```

Each environment returns observations in the form of a numpy array.

We also expose a bsuite_num_episodes property for each environment in bsuite. This allows users to run exactly the number of episodes required for bsuite's analysis, which may vary between environments used in different experiments.

Example run loop for a hypothetical agent with a step() method:

```
for _ in range(env.bsuite_num_episodes):
  timestep = env.reset()
  while not timestep.last():
    action = agent.step(timestep)
    timestep = env.step(action)
  agent.step(timestep)
```
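The run loop above leaves the agent unspecified. One possible agent is sketched below: it acts uniformly at random and simply accumulates the rewards it sees. `FakeTimeStep` and `FakeEnv` are hypothetical stand-ins so that the example runs without bsuite installed; with bsuite, `env` would come from bsuite.load_from_id and `num_actions` from env.action_spec().num_values.

```python
import collections
import random


class FakeTimeStep(
    collections.namedtuple('FakeTimeStep', ['done', 'reward', 'observation'])):
  """Hypothetical stand-in for a dm_env timestep."""

  def last(self):
    return self.done


class FakeEnv:
  """Hypothetical stand-in environment with fixed-length episodes."""

  bsuite_num_episodes = 5

  def __init__(self, episode_length=4):
    self._episode_length = episode_length
    self._t = 0

  def reset(self):
    self._t = 0
    return FakeTimeStep(done=False, reward=None, observation=self._t)

  def step(self, action):
    self._t += 1
    done = self._t >= self._episode_length
    return FakeTimeStep(done=done, reward=1.0, observation=self._t)


class RandomAgent:
  """Picks actions uniformly at random and tracks the reward it observes."""

  def __init__(self, num_actions, seed=0):
    self._num_actions = num_actions
    self._rng = random.Random(seed)
    self.total_reward = 0.0
    self.steps_seen = 0

  def step(self, timestep):
    # The first timestep of an episode carries no reward.
    if timestep.reward is not None:
      self.total_reward += timestep.reward
    self.steps_seen += 1
    return self._rng.randrange(self._num_actions)


env = FakeEnv(episode_length=4)
agent = RandomAgent(num_actions=3)
for _ in range(env.bsuite_num_episodes):
  timestep = env.reset()
  while not timestep.last():
    action = agent.step(timestep)
    timestep = env.step(action)
  agent.step(timestep)  # Lets the agent observe the final reward.
```

The extra agent.step(timestep) call after the inner loop matters: without it, the agent would never see the reward attached to the last timestep of each episode.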

Using `bsuite` in 'OpenAI Gym' format

To use bsuite with a codebase that uses the OpenAI Gym interface, use the GymFromDMEnv class in [utils/gym_wrapper.py]:

```
import bsuite
from bsuite.utils import gym_wrapper

env = bsuite.load_and_record_to_csv('catch/0', results_dir='/path/to/results')
gym_env = gym_wrapper.GymFromDMEnv(env)
```

Note that bsuite does not include Gym in its default dependencies, so you may need to pip install it separately.
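To give a feel for what such a wrapper has to do, here is a simplified adapter from a dm_env-style interface to the classic Gym convention, where reset() returns an observation and step() returns an (observation, reward, done, info) tuple. This is an illustration only and not the GymFromDMEnv implementation; `FakeTimeStep`, `TwoStepEnv`, and `GymStyleAdapter` are hypothetical names.

```python
import collections


class FakeTimeStep(
    collections.namedtuple('FakeTimeStep', ['done', 'reward', 'observation'])):
  """Hypothetical stand-in for a dm_env timestep."""

  def last(self):
    return self.done


class TwoStepEnv:
  """Hypothetical environment whose episodes end after two steps."""

  def __init__(self):
    self._t = 0

  def reset(self):
    self._t = 0
    return FakeTimeStep(done=False, reward=None, observation=self._t)

  def step(self, action):
    self._t += 1
    return FakeTimeStep(
        done=self._t >= 2, reward=float(action), observation=self._t)


class GymStyleAdapter:
  """Illustrative adapter exposing a classic Gym-style reset()/step()."""

  def __init__(self, env):
    self._env = env

  def reset(self):
    return self._env.reset().observation

  def step(self, action):
    timestep = self._env.step(action)
    reward = timestep.reward or 0.0  # A missing reward is reported as 0.0.
    return timestep.observation, reward, timestep.last(), {}


gym_env = GymStyleAdapter(TwoStepEnv())
first_observation = gym_env.reset()
first_step = gym_env.step(1)
second_step = gym_env.step(0)
```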

Baseline agents

We include implementations of several common agents in the [baselines/] subdirectory, along with a minimal run-loop.

See the installation section for how to include the required dependencies at install time. These dependencies are not installed by default, since bsuite does not require users to use any specific machine learning library.

Running the entire suite of experiments

Each agent in the baselines folder includes a run script, which serves as an example. It can run against a single environment or, by passing the --bsuite_id=SWEEP flag, against the entire suite of experiments; the latter starts a pool of processes to run as many experiments in parallel as the host machine allows. On a 12-core machine, this will complete overnight for most agents. Alternatively, you can run on Google Cloud Platform using run_on_gcp.sh, the steps of which are outlined below.

Running experiments on Google Cloud Platform

run_on_gcp.sh does the following in order:

1. Creates an instance with the specified specs (by default, 64-core CPU optimized).
2. Clones bsuite with git and installs it together with other dependencies.
3. Runs the specified agent (currently limited to /baselines) on a specified environment.
4. Copies the resulting SQLite file from the remote instance to /tmp/bsuite.db on your local machine.
5. Shuts down the created instance.

To run the script, you first need to create a billing account. Then follow the instructions here to set up and initialize Cloud SDK. After completing gcloud init, you are ready to run bsuite on Google Cloud.

To do this, make run_on_gcp.sh executable and run it:

```
chmod +x run_on_gcp.sh
./run_on_gcp.sh
```

After the instance is created, its name is printed. You can then SSH into the instance by selecting Compute Engine -> Instances and clicking SSH. This is not necessary, since the result is copied to your local machine once it is ready, but it can be convenient if you want to make local changes to the agent and environments. In that case, after connecting, run

```
source ~/bsuite_env/bin/activate
```

to activate the virtual environment. You can then run agents with, for instance:

```
python ~/bsuite/bsuite/baselines/dqn/run.py --bsuite_id=SWEEP
```

Analysis

bsuite comes with a ready-made analysis Jupyter notebook included in [analysis/results.ipynb]. This notebook loads and processes logged data, and produces the scores and plots for each experiment. We recommend using this notebook in conjunction with Colaboratory.

We provide an example of such a bsuite report here.

`bsuite` Report

You can use bsuite to generate an automated 1-page
