# Tinydreamer

An implementation of delta-iris in tinygrad.
## DreamerV3
https://arxiv.org/abs/2301.04104
The goal is to do well on Atari100k (`pip install "gym[atari]" "autorom[accept-rom-license]"`), though BSuite (`pip install bsuite`) looks interesting too.
This is designed to run on a tinybox, either red or green, with just `./train.py`.
### Process
- Run https://github.com/danijar/dreamerv3 to train a model that plays Pong
- Get that model loaded into tinygrad and running, both the policy model and decoder
- Get fine tuning working
- Get full training working
## delta-iris

This might be a better choice; the repo is a lot easier to read: https://github.com/vmicheli/delta-iris
Three models:
- actor_critic (two copies: model and target_model)
- world_model
  - transformer: takes in (frames_emb x1, act_tokens_emb x1, latents_emb x4) x many
  - frame_cnn (FrameEncoder), output 4 channels
- tokenizer
  - encoder (FrameEncoder): 7 input channels (3 for prev_frame, 1 for action, 3 for frame), output 64 channels for the quantizer
  - frame_cnn (FrameEncoder), output 16 channels
  - decoder (FrameDecoder): 84 input channels (16 for prev_frame, 4 for action, 64 for latents), outputs an image
  - quantizer
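The channel counts above can be sanity-checked with a minimal shape sketch. This is plain NumPy, not the repo's API; the array names and the 64x64 frame size are assumptions for illustration:

```python
import numpy as np

B, H, W = 1, 64, 64  # batch and frame size are assumptions

prev_frame = np.zeros((B, 3, H, W))  # RGB frame at time t
frame      = np.zeros((B, 3, H, W))  # RGB frame at time t+1
act_plane  = np.zeros((B, 1, H, W))  # action embedded as one spatial plane

# tokenizer encoder input: 3 (prev_frame) + 1 (action) + 3 (frame) = 7 channels
enc_in = np.concatenate([prev_frame, act_plane, frame], axis=1)
assert enc_in.shape[1] == 7

frame_feats  = np.zeros((B, 16, H, W))  # frame_cnn(prev_frame), 16 channels
act_planes   = np.zeros((B, 4, H, W))   # decoder_act_emb(a), 4 channels
latent_feats = np.zeros((B, 64, H, W))  # quantized latents, 64 channels

# tokenizer decoder input: 16 (prev_frame) + 4 (action) + 64 (latents) = 84 channels
dec_in = np.concatenate([frame_feats, act_planes, latent_feats], axis=1)
assert dec_in.shape[1] == 84
```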
Training:
- Happens in three distinct phases
- First, the tokenizer is trained. It outputs 4 tokens per delta image (from a vocab of 1024, codebook dim of 64)
  - `q = encoder(img_0, encoder_act_emb(a), img_1)`
  - `decoder(frame_cnn(img_0), decoder_act_emb(a), q)`
- Then, the world model is trained
  - `transformer([frame_cnn(img_0), act_emb(a), latents_emb(tokens_from_encoder), ...])`
- Last, the actor critic is trained (inside the world model)
Our training strategy is to reproduce each one in reverse.
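Per the world-model bullet above, each timestep contributes 1 frame embedding, 1 action embedding, and 4 latent embeddings to the transformer's input sequence, i.e. 6 tokens per step. A minimal sketch of that interleaving (the token names are illustrative placeholders, not the repo's identifiers):

```python
# Build the per-timestep token layout fed to the world-model transformer:
# [frame, act, latent x4] repeated for each step.
def build_sequence(num_steps):
    seq = []
    for t in range(num_steps):
        seq.append(f"frame_{t}")                          # frames_emb x1
        seq.append(f"act_{t}")                            # act_tokens_emb x1
        seq.extend(f"latent_{t}_{i}" for i in range(4))   # latents_emb x4
    return seq

seq = build_sequence(2)
assert len(seq) == 12  # 6 tokens per timestep
```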
