# Tinydreamer

An implementation of delta-iris in tinygrad.
## DreamerV3
https://arxiv.org/abs/2301.04104
The goal is to do well on Atari100k (`pip install "gym[atari]" "autorom[accept-rom-license]"`), though BSuite (`pip install bsuite`) looks interesting too.
This is designed to run on a tinybox, either red or green, with just `./train.py`.
### Process
- Run https://github.com/danijar/dreamerv3 to train a model that plays Pong
- Get that model loaded into tinygrad and running, both the policy model and decoder
- Get fine tuning working
- Get full training working
## delta-iris

This might be a better choice; the repo is a lot easier to read: https://github.com/vmicheli/delta-iris
Three models:
- actor_critic (two copies: model and target_model)
- world_model
  - transformer: takes in (frames_emb x1, act_tokens_emb x1, latents_emb x4) x many
  - frame_cnn (FrameEncoder), output 4 channels
- tokenizer
  - encoder (FrameEncoder): 7 input channels (3 for prev_frame, 1 for action, 3 for frame), output 64 channels for the quantizer
  - frame_cnn (FrameEncoder), output 16 channels
  - decoder (FrameDecoder): 84 input channels (16 for prev_frame, 4 for action, 64 for latents), outputs an image
  - quantizer
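The channel counts above can be sanity-checked with a minimal shape sketch. This is plain NumPy, not the repo's API; the array names and the 64x64 frame size are assumptions for illustration:

```python
import numpy as np

B, H, W = 1, 64, 64  # batch and frame size are assumptions

prev_frame = np.zeros((B, 3, H, W))  # RGB frame at time t
frame      = np.zeros((B, 3, H, W))  # RGB frame at time t+1
act_plane  = np.zeros((B, 1, H, W))  # action embedded as one spatial plane

# tokenizer encoder input: 3 (prev_frame) + 1 (action) + 3 (frame) = 7 channels
enc_in = np.concatenate([prev_frame, act_plane, frame], axis=1)
assert enc_in.shape[1] == 7

frame_feats  = np.zeros((B, 16, H, W))  # frame_cnn(prev_frame), 16 channels
act_planes   = np.zeros((B, 4, H, W))   # decoder_act_emb(a), 4 channels
latent_feats = np.zeros((B, 64, H, W))  # quantized latents, 64 channels

# tokenizer decoder input: 16 (prev_frame) + 4 (action) + 64 (latents) = 84 channels
dec_in = np.concatenate([frame_feats, act_planes, latent_feats], axis=1)
assert dec_in.shape[1] == 84
```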
Training:
- Happens in three distinct phases
- First, the tokenizer is trained. It outputs 4 tokens per delta image (from a vocab of 1024, codebook dim of 64)
  - `q = encoder(img_0, encoder_act_emb(a), img_1)`
  - `decoder(frame_cnn(img_0), decoder_act_emb(a), q)`
- Then, the world model is trained
  - `transformer([frame_cnn(img_0), act_emb(a), latents_emb(tokens_from_encoder), ...])`
- Last, the actor critic is trained (inside the world model)
Our training strategy is to reproduce each one in reverse.
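Per the world-model bullet above, each timestep contributes 1 frame embedding, 1 action embedding, and 4 latent embeddings to the transformer's input sequence, i.e. 6 tokens per step. A minimal sketch of that interleaving (the token names are illustrative placeholders, not the repo's identifiers):

```python
# Build the per-timestep token layout fed to the world-model transformer:
# [frame, act, latent x4] repeated for each step.
def build_sequence(num_steps):
    seq = []
    for t in range(num_steps):
        seq.append(f"frame_{t}")                          # frames_emb x1
        seq.append(f"act_{t}")                            # act_tokens_emb x1
        seq.extend(f"latent_{t}_{i}" for i in range(4))   # latents_emb x4
    return seq

seq = build_sequence(2)
assert len(seq) == 12  # 6 tokens per timestep
```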
