# Rainbow
Rainbow: Combining Improvements in Deep Reinforcement Learning [1].
Results and pretrained models can be found in the releases.
- [x] DQN [2]
- [x] Double DQN [3]
- [x] Prioritised Experience Replay [4]
- [x] Dueling Network Architecture [5]
- [x] Multi-step Returns [6]
- [x] Distributional RL [7]
- [x] Noisy Nets [8]
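As a rough illustration of one of these components, the dueling architecture [5] splits the network into a state-value stream and an advantage stream and recombines them into Q-values. A minimal NumPy sketch of just the aggregation step (the stream outputs below are made-up numbers, not values from this repository):

```python
import numpy as np

# Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
# Subtracting the mean advantage keeps the two streams identifiable.
def dueling_q(value, advantages):
    advantages = np.asarray(advantages, dtype=np.float64)
    return value + advantages - advantages.mean()

# Example with made-up stream outputs for a 4-action problem.
q = dueling_q(1.0, [0.5, -0.5, 1.5, -1.5])
print(q)  # [ 1.5  0.5  2.5 -0.5]
```

Because the mean advantage is subtracted out, the Q-values are centred on the state value, which is what makes the value/advantage decomposition unique.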
Run the original Rainbow with the default arguments:
```sh
python main.py
```
Data-efficient Rainbow [9] can be run with the following options (note that the "unbounded" memory is implemented in practice by setting the memory capacity equal to the maximum number of timesteps):
```sh
python main.py --target-update 2000 \
               --T-max 100000 \
               --learn-start 1600 \
               --memory-capacity 100000 \
               --replay-frequency 1 \
               --multi-step 20 \
               --architecture data-efficient \
               --hidden-size 256 \
               --learning-rate 0.0001 \
               --evaluation-interval 10000
```
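The `--multi-step` flag controls the length of the multi-step return [6] used in the bootstrap target. A small sketch of the truncated n-step return (illustrative only, not this repository's implementation, which also folds in the distributional bootstrap term):

```python
# n-step return: G = sum_{k=0}^{n-1} gamma^k * r_{t+k}
# The agent would add gamma**n * Q(s_{t+n}, a*) as a bootstrap on top.
def n_step_return(rewards, gamma, n):
    """Discounted sum of the first n rewards from a trajectory."""
    return sum(gamma ** k * r for k, r in enumerate(rewards[:n]))

print(n_step_return([1.0, 1.0, 1.0, 1.0], gamma=0.5, n=3))  # 1 + 0.5 + 0.25 = 1.75
```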
Note that pretrained models from the 1.3 release used a (slightly) incorrect network architecture. To use these, change the padding in the first convolutional layer from 0 to 1 (DeepMind uses "valid" (no) padding).
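To see why the padding matters, the spatial output size of a convolution can be checked by hand with the standard formula (floor convention, as in PyTorch). A sketch, where the 5×5/stride-5 first layer is an assumed example shape, not taken from the repository:

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a conv layer: floor((W + 2P - K) / S) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Example: an 84x84 frame through an assumed 5x5, stride-5 first conv layer.
print(conv_out_size(84, kernel=5, stride=5, padding=0))  # 16
print(conv_out_size(84, kernel=5, stride=5, padding=1))  # 17
```

A change of padding can change the downstream feature-map sizes (and hence the first fully-connected layer's input), which is why pretrained weights are tied to a specific padding choice.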
## Requirements
To install all dependencies with Anaconda, run `conda env create -f environment.yml` and use `source activate rainbow` to activate the environment.
Available Atari games can be found in the `atari-py` ROMs folder.
## Acknowledgements
- @floringogianu for categorical-dqn
- @jvmancuso for Noisy layer
- @jaara for AI-blog
- @openai for Baselines
- @mtthss for implementation details
## References
[1] Rainbow: Combining Improvements in Deep Reinforcement Learning
[2] Playing Atari with Deep Reinforcement Learning
[3] Deep Reinforcement Learning with Double Q-learning
[4] Prioritized Experience Replay
[5] Dueling Network Architectures for Deep Reinforcement Learning
[6] Reinforcement Learning: An Introduction
[7] A Distributional Perspective on Reinforcement Learning
[8] Noisy Networks for Exploration
[9] When to Use Parametric Models in Reinforcement Learning?