DreamerV3
Mastering Diverse Domains through World Models
A reimplementation of DreamerV3, a scalable and general reinforcement learning algorithm that masters a wide range of applications with fixed hyperparameters.

If you find this code useful, please reference it in your paper:
```bibtex
@article{hafner2025dreamerv3,
  title={Mastering diverse control tasks through world models},
  author={Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy},
  journal={Nature},
  pages={1--7},
  year={2025},
  publisher={Nature Publishing Group}
}
```
To learn more:
DreamerV3
DreamerV3 learns a world model from experiences and uses it to train an actor critic policy from imagined trajectories. The world model encodes sensory inputs into categorical representations and predicts future representations and rewards given actions.
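The categorical representations mentioned above can be pictured as a stack of one-hot vectors sampled from per-latent softmax distributions. The sketch below is purely illustrative and not the repository's implementation (which uses JAX and straight-through gradients); the shapes (32 latents of 32 classes each) follow the paper, and the function name is made up for this example.

```python
import numpy as np

def encode_to_categoricals(logits, rng):
    """Sample one-hot categorical latents from encoder logits.

    logits: array of shape (num_latents, num_classes), e.g. (32, 32).
    Returns a (num_latents, num_classes) array of one-hot vectors,
    one active class per latent.
    """
    # Softmax over the class dimension to get per-latent distributions.
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # Sample one class index per latent and one-hot encode it.
    idx = np.array([rng.choice(len(p), p=p) for p in probs])
    return np.eye(logits.shape[-1])[idx]

rng = np.random.default_rng(0)
latent = encode_to_categoricals(rng.normal(size=(32, 32)), rng)
assert latent.shape == (32, 32)
assert np.all(latent.sum(axis=-1) == 1)  # exactly one active class per latent
```

In the actual algorithm, gradients flow through the sampling step via a straight-through estimator, which this numpy sketch does not model.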

DreamerV3 masters a wide range of domains with a fixed set of hyperparameters, outperforming specialized methods. Removing the need for tuning reduces the amount of expert knowledge and computational resources needed to apply reinforcement learning.
Due to its robustness, DreamerV3 shows favorable scaling properties. Notably, using larger models consistently increases not only its final performance but also its data-efficiency. Increasing the number of gradient steps further increases data efficiency.

Instructions
The code has been tested on Linux and Mac and requires Python 3.11+.
Docker
You can either use the provided Dockerfile that contains instructions or
follow the manual instructions below.
Manual
Install JAX and then the other dependencies:
```sh
pip install -U -r requirements.txt
```
Training script:
```sh
python dreamerv3/main.py \
  --logdir ~/logdir/dreamer/{timestamp} \
  --configs crafter \
  --run.train_ratio 32
```
To reproduce results, train on the desired task using the corresponding config,
such as --configs atari --task atari_pong.
View results:
```sh
pip install -U scope
python -m scope.viewer --basedir ~/logdir --port 8000
```
Scalar metrics are also written as JSONL files.
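Because each JSONL line is an independent JSON object, the metrics files are easy to load with the standard library alone. This is a minimal sketch, assuming a scalar-metrics file inside the logdir; the exact filename depends on the run, so the path used below is only an example.

```python
import json
from pathlib import Path

def load_scalars(path):
    """Read scalar metrics from a JSON-lines file into a list of dicts.

    Each line is a standalone JSON object, so a file from a still-running
    job can be read by stopping at an incomplete trailing line.
    """
    records = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            break  # incomplete last line of an active run
    return records
```

For example, `load_scalars('~/logdir/dreamer/run1/metrics.jsonl')` (hypothetical path) would return one dict per logged step, ready for plotting with any tool you prefer.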
Tips
- All config options are listed in `dreamerv3/configs.yaml` and you can override them as flags from the command line.
- The `debug` config block reduces the network size, batch size, duration between logs, and so on for fast debugging (but does not learn a good model).
- By default, the code tries to run on GPU. You can switch to CPU or TPU using the `--jax.platform cpu` flag.
- You can use multiple config blocks that will override defaults in the order they are specified, for example `--configs crafter size50m`.
- By default, metrics are printed to the terminal, appended to a JSON lines file, and written as Scope summaries. Other outputs like WandB and TensorBoard can be enabled in the training script.
- If you get a `Too many leaves for PyTreeDef` error, it means you're reloading a checkpoint that is not compatible with the current config. This often happens when reusing an old logdir by accident.
- If you are getting CUDA errors, scroll up, because the cause is often just an error that happened earlier, such as out of memory or incompatible JAX and CUDA versions. Try `--batch_size 1` to rule out an out-of-memory error.
- Many environments are included, some of which require installing additional packages. See the `Dockerfile` for reference.
- To continue stopped training runs, simply run the same command line again and make sure that the `--logdir` points to the same directory.
Disclaimer
This repository contains a reimplementation of DreamerV3 based on the open source DreamerV2 code base. It is unrelated to Google or DeepMind. The implementation has been tested to reproduce the official results on a range of environments.