GenieRedux
This is the official repository of <b>Exploration-Driven Generative Interactive Environments, CVPR'25</b>.
Authors: Nedko Savov, Naser Kazemi, Mohammad Mahdi, Danda Pani Paudel, Xi Wang, Luc Van Gool
<div style="width: 100%; max-width: 800px; margin: auto;">
  <div style="display: flex; gap: 0.5em;">
    <img src="docs/genieredux-g-pretraining-replicate.gif" alt="Replicate" style="flex: 1 1 50%; height: auto; width: auto; max-width: 40.0%;" />
    <img src="docs/genieredux-g-explore.gif" alt="Explore" style="flex: 1 1 50%; height: auto; width: auto; max-width: 60.0%;" />
  </div>
</div>

We present a framework for training multi-environment world models spanning hundreds of environments with different visuals and actions. Training is cost-effective, as we rely on automatic collection from virtual environments instead of hand-curated datasets of human demonstrations. The framework consists of three components:
- <b>RetroAct</b> - a dataset of 974 retro game environments, annotated with behavior, camera view, motion axes, and controls
- <b>GenieRedux-G</b> - a multi-environment transformer world model adapted for virtual environments; an enhanced version of GenieRedux - our open implementation of the Genie world model (Bruce et al.).
- <b>AutoExplore Agent</b> - an exploration agent that explores environments based entirely on the dynamics prediction uncertainty of GenieRedux, removing the need for an environment-specific reward and providing diverse training data for our world model.
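The exploration idea behind AutoExplore Agent can be sketched in a few lines: treat the world model's predictive uncertainty over next-frame tokens as an intrinsic reward, so the agent seeks transitions the model predicts least confidently. The snippet below is an illustrative toy sketch of that principle using predictive entropy, not the repository's implementation (the hypothetical `probs` input stands in for the dynamics model's softmax outputs):

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> float:
    """Mean entropy (nats) over per-token next-frame distributions.

    probs: (num_tokens, vocab_size) softmax outputs from a dynamics
    model (hypothetical interface, for illustration only)."""
    eps = 1e-12  # avoid log(0)
    ent = -(probs * np.log(probs + eps)).sum(axis=-1)
    return float(ent.mean())

# Toy check: a uniform prediction (maximal uncertainty) should yield a
# larger intrinsic reward than a confident, peaked prediction.
uniform = np.full((4, 8), 1.0 / 8)
peaked = np.zeros((4, 8))
peaked[:, 0] = 0.93   # most mass on one token
peaked[:, 1:] = 0.01  # rows still sum to 1.0
r_uncertain = predictive_entropy(uniform)
r_confident = predictive_entropy(peaked)
```

An agent maximizing this reward is pushed toward states the world model cannot yet predict, which is what yields diverse training data without any game-specific reward.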
In our latest work, we demonstrate our method on many platformer environments obtained from our annotated dataset. We provide the training and evaluation code.
🚧 The complete codebase has been released. We are working on preparing and providing trained checkpoint files.
⚠️ For a minimal case study with the Coinrun environment (as described here), where both GenieRedux and GenieRedux-G are demonstrated, with pretrained weights and with an option for a trained agent, please refer to the neurips branch.
Installation
<b>Prerequisites:</b>
- Ensure you have Conda installed on your system. You can download and install Conda from the official website.

<b>Clone the repository.</b>

git clone https://github.com/insait-institute/GenieRedux.git
cd GenieRedux

<b>Install environments.</b>

bash install.sh

Installs 3 conda environments:
- retro_datagen - data generation environment with support for Stable-Retro.
- genie_redux - environment for training and evaluation of GenieRedux models.
- auto_explore - environment for training and evaluation of AutoExplore Agent models.
In addition, our modified Agent57 repository is set up and our pretrained Agent57 models are downloaded.
⚠️ You need to obtain and import the game ROMs in Stable-Retro. To do so, please follow instructions at the Stable-Retro Docs.
Note: This implementation is tested on Linux-64 with Python 3.13 and Conda package manager.
Quickstart
Initial Data Generation
To generate all initial datasets (saved in data_generation/datasets/), run:
conda activate retro_datagen
python run.py generate config=retro_act/pretrain
python run.py generate config=retro_act/control
python run.py generate config=retro_act/control_test
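Once generation finishes, the datasets can be inspected programmatically. The sketch below assumes a hypothetical on-disk layout of one `.npz` file per episode holding `frames` (T, H, W, C) and `actions` (T,); the actual layout produced under `data_generation/datasets/` may differ, so check the output directory after a run:

```python
import numpy as np
import tempfile
from pathlib import Path

# Build a stand-in dataset directory with one fake episode, mimicking an
# assumed layout (episode_*.npz with 'frames' and 'actions' arrays).
root = Path(tempfile.mkdtemp()) / "retro_act" / "pretrain"
root.mkdir(parents=True)
np.savez(root / "episode_00000.npz",
         frames=np.zeros((16, 64, 64, 3), dtype=np.uint8),
         actions=np.zeros(16, dtype=np.int64))

# Iterate episodes and count frames, checking frame/action alignment.
total_frames = 0
for episode in sorted(root.glob("episode_*.npz")):
    data = np.load(episode)
    assert len(data["frames"]) == len(data["actions"])  # one action per frame
    total_frames += len(data["frames"])
```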
Training GenieRedux
Before we start, we set up the environment:
conda activate genie_redux
Tokenizer
To train the tokenizer on the generated dataset (for 150k iterations), run:
python run.py genie_redux train config=tokenizer.yaml train.num_processes=6 train.batch_size=7 train.grad_accum=2
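The three flags jointly determine the effective batch size: assuming `train.batch_size` is per process (as is typical for Accelerate-style multi-process launchers), `train.num_processes` processes each see `batch_size` samples, accumulated over `train.grad_accum` steps before one optimizer update. A quick sanity check:

```python
def effective_batch_size(num_processes: int, batch_size: int, grad_accum: int) -> int:
    """Samples contributing to a single optimizer step (assumes a
    per-process batch_size, which may not match the actual launcher)."""
    return num_processes * batch_size * grad_accum

# Flags from the tokenizer command above.
tokenizer_ebs = effective_batch_size(6, 7, 2)  # 84 samples per update
```

Keeping this product constant is one way to trade `batch_size` against `grad_accum` when GPU memory is tight.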
GenieRedux-G (Dynamics Only) Pretraining
In our paper, we pretrain a model, conditioned on ground truth actions, on 200 platformers:
python run.py genie_redux train config=genie_redux_guided_pretrain.yaml train.num_processes=7 train.batch_size=4 train.grad_accum=3 tokenizer_fpath=checkpoints/tokenizer/tokenizer/model-150000.pt
If you have more resources, we advise pretraining on all platformers by adding the parameter train.n_envs=0. To account for more environments, also set a higher value for train.num_train_steps.
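How much higher to set `train.num_train_steps` is not prescribed by the repository; one simple heuristic (our suggestion) is to scale it linearly with the number of environments so each environment sees a comparable number of updates:

```python
def scaled_train_steps(base_steps: int, base_envs: int, n_envs: int) -> int:
    """Linear scaling of training steps with environment count (a
    heuristic, not a setting prescribed by the authors)."""
    return round(base_steps * n_envs / base_envs)

# E.g. going from the 200-platformer pretraining run (180k steps, the
# checkpoint horizon used in this README) to all 974 RetroAct games.
steps = scaled_train_steps(180_000, 200, 974)
```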
GenieRedux-G-50
Finetuning on 50 control-aligned environments:
python run.py genie_redux train config=genie_redux_guided_50 train.num_processes=7 train.batch_size=4 train.grad_accum=3 model_fpath=checkpoints/genie_redux_guided/genie_redux_guided_pretrain/model-180000.pt
(Optional) GenieRedux (Dynamics+LAM)
Having the trained tokenizer, we can now train GenieRedux:
python run.py genie_redux train config=genie_redux train.num_processes=7 train.batch_size=3 train.grad_accum=4 tokenizer_fpath=checkpoints/tokenizer/tokenizer/model-150000.pt
Evaluating GenieRedux
To get a quantitative evaluation (ΔPSNR, FID, PSNR, SSIM), run:
python run.py genie_redux eval config=genie_redux_guided_50 eval.action_to_take=-1 eval.model_fpath=checkpoints/genie_redux_guided/genie_redux_guided/model-100000.pt eval.inference_method=one_go
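As a reference for what these metrics capture: PSNR is computed directly from the MSE between predicted and ground-truth frames, and ΔPSNR (following Genie) contrasts prediction quality under the true actions against randomized actions, so a large gap indicates the model actually uses the actions. A minimal numpy sketch with toy images, not the repository's evaluation code:

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images in [0, 255]."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val**2 / mse)

gt = np.zeros((64, 64, 3))
pred_true_actions = gt + 10.0    # toy prediction: uniformly off by 10
pred_random_actions = gt + 40.0  # worse when actions are randomized
# Controllability measure: drop in fidelity when actions are shuffled.
delta_psnr = psnr(gt, pred_true_actions) - psnr(gt, pred_random_actions)
```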
Training AutoExplore Agent
conda activate auto_explore
Launch AutoExplore training via the unified launcher using the auto_explore stack. Lightning Fabric handles device setup internally.
python run.py auto_explore train common.root_dpath=checkpoints/auto_explore world_model.root_dpath=checkpoints/genie_redux_guided world_model.model_dname=genie_redux_guided world_model.model_fname=model-100000.pt collection.games='["SuperMarioBros-Nes"]'
Notes:
- world_model.* points the agent to a pretrained Genie/GenieRedux checkpoint (directory name + filename).
- collection.games selects the game. Example games to try: <i>AdventureIslandII-Nes, SuperMarioBros-Nes, Flintstones-Genesis, TinyToonAdventuresBustersHiddenTreasure-Genesis, BronkieTheBronchiasaurus-Snes, BugsBunnyBirthdayBlowout-Nes</i>
- Outputs (checkpoints, configs, media) are written under common.root_dpath/<run_name>.
Evaluating AutoExplore Agent
conda activate auto_explore
Launch evaluation via the unified launcher using the auto_explore stack.
python run.py auto_explore eval world_model.root_dpath=checkpoints/genie_redux_guided world_model.model_dname=genie_redux_guided world_model.model_fname=model-100000.pt common.root_dpath=checkpoints/auto_explore common.resume_id=1 common.resume_ckpt_id=model_best_reward collection.games='["SuperMarioBros-Nes"]'
Notes:
- common.resume_id specifies which training run to evaluate (matches the numbered run directory).
- common.resume_ckpt_id selects the checkpoint file within that run (e.g. model_best_reward).
- collection.games selects the game (use the same game as in training).
- Evaluation runs a single very long episode per epoch, computes the average return, and generates GIFs, stored on wandb and in <run_dir>/outputs/gifs/.
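The interplay of common.resume_id and common.resume_ckpt_id can be illustrated against the checkpoint path used later in this README (checkpoints/auto_explore/001_auto_explore/checkpoints/model_best_reward.pt). The zero-padded run-directory naming below is our assumption inferred from that path, not a documented scheme:

```python
from pathlib import Path

def resolve_checkpoint(root: str, resume_id: int, ckpt_id: str,
                       run_name: str = "auto_explore") -> Path:
    """Map resume_id / resume_ckpt_id to a checkpoint file, assuming
    runs are stored as <root>/<NNN>_<run_name>/checkpoints/<ckpt_id>.pt."""
    return Path(root) / f"{resume_id:03d}_{run_name}" / "checkpoints" / f"{ckpt_id}.pt"

ckpt = resolve_checkpoint("checkpoints/auto_explore", 1, "model_best_reward")
```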
AutoExplore Data Generation
To generate a dataset with the trained AutoExplore Agent (saved in data_generation/datasets/), run:
conda activate retro_datagen
python run.py generate config=retro_act/auto_explore_ai2 config.connector.agent.checkpoint_fpath=`realpath checkpoints/auto_explore/001_auto_explore/checkpoints/model_best_reward.pt`
Finetuning GenieRedux on AutoExplore Data
First, the tokenizer is finetuned:
conda activate genie_redux
python run.py genie_redux train config=tokenizer_ft_smb.yaml train.num_processes=6 train.batch_size=7 train.grad_accum=2 tokenizer_fpath=checkpoints/tokenizer/tokenizer/model-150000.pt
Then, given the finetuned tokenizer, the dynamics model is finetuned:
python run.py genie_redux train config=genie_redux_guided_ft_smb train.num_processes=7 train.batch_size=4 train.grad_accum=3 model_fpath=checkpoints/genie_redux_guided/genie_redux_guided/model-100000.pt tokenizer_fpath=checkpoints/tokenizer/tokenizer_ft_smb/model-150000.pt
Evaluating the finetuned GenieRedux-G with Agent57
First, generate a dataset with our pretrained weights:
conda activate retro_datagen
py

