VaST
Variational State Tabulation
An implementation of Variational State Tabulation, using Python, TensorFlow and Cython. Based on the paper here: https://arxiv.org/abs/1802.04325.
Prerequisites
You should have a working installation of TensorFlow (https://www.tensorflow.org/install/). The following command should install all required Python modules:
pip install -r requirements.txt
Installation
You will need to install the Cython submodule and cythonize several modules:
git submodule init
git submodule update
chmod +x table/cypridict/install.sh
./table/cypridict/install.sh
cython table/hamming.pyx
cython table/ctable.pyx
Testing
Run
pytest
to run all unit tests (on the priority queue, the prioritized sweeping algorithm and the replay memory).
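For readers unfamiliar with the components under test, the following is a minimal pure-Python sketch of prioritized sweeping driven by a priority queue. It is not the repository's Cython implementation: the function name `prioritized_sweeping`, its parameters, and the deterministic dictionary-based model are all assumptions made for illustration, and the standard-library `heapq` stands in for the Cython priority queue.

```python
import heapq
from collections import defaultdict


def prioritized_sweeping(transitions, rewards, gamma=0.9, theta=1e-4, max_updates=1000):
    """Back up state values in order of largest expected change.

    transitions: dict mapping (state, action) -> next_state (deterministic model)
    rewards:     dict mapping (state, action) -> reward
    All names here are hypothetical; this is an illustrative sketch only.
    """
    predecessors = defaultdict(set)  # next_state -> states that lead to it
    actions = defaultdict(list)      # state -> actions available in it
    for (state, action), next_state in transitions.items():
        predecessors[next_state].add(state)
        actions[state].append(action)

    values = defaultdict(float)      # unseen states default to a value of 0

    def backup(state):
        # One-step Bellman backup over the known actions of `state`.
        return max(
            rewards[(state, a)] + gamma * values[transitions[(state, a)]]
            for a in actions[state]
        )

    # heapq is a min-heap, so priorities are negated to pop the largest first.
    heap = []
    for state in list(actions):
        priority = abs(backup(state) - values[state])
        if priority > theta:
            heapq.heappush(heap, (-priority, state))

    updates = 0
    while heap and updates < max_updates:
        _, state = heapq.heappop(heap)
        values[state] = backup(state)
        updates += 1
        # A change at `state` may change the backed-up value of its predecessors.
        for pred in predecessors[state]:
            priority = abs(backup(pred) - values[pred])
            if priority > theta:
                heapq.heappush(heap, (-priority, pred))
    return dict(values)
```

On a three-state chain where only the last transition is rewarded, the reward propagates backwards one state per update:

```python
transitions = {(0, "right"): 1, (1, "right"): 2}
rewards = {(0, "right"): 0.0, (1, "right"): 1.0}
values = prioritized_sweeping(transitions, rewards, gamma=0.5)
```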
Example Run
Use the command
CUDA_VISIBLE_DEVICES=0 python run.py doom tmaze --num_steps=500000 --burnin=10000 --epsilon_period=40000
to run the experiment shown in Fig. 6 of the paper. By default, 125 minibatches are loaded onto the GPU at once, and a separate thread queues the minibatches for training the network. I recommend running only one job on each GPU (here, device 0) to avoid possible concurrency issues.
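The pattern of a separate thread queueing minibatches can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the code in `run.py`: the names `start_minibatch_loader`, `sample_minibatch`, `chunk_size` and `max_chunks` are hypothetical, and no GPU transfer is performed here.

```python
import queue
import threading


def start_minibatch_loader(sample_minibatch, chunk_size=125, max_chunks=2):
    """Start a background thread that keeps a bounded queue of minibatch chunks.

    sample_minibatch: zero-argument callable returning one minibatch
                      (hypothetical stand-in for sampling from replay memory)
    chunk_size:       number of minibatches grouped into one chunk
    max_chunks:       queue capacity; the producer waits when it is full
    """
    chunks = queue.Queue(maxsize=max_chunks)
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            chunk = [sample_minibatch() for _ in range(chunk_size)]
            # Retry with a timeout so the thread can notice a stop request
            # even while the queue is full.
            while not stop.is_set():
                try:
                    chunks.put(chunk, timeout=0.1)
                    break
                except queue.Full:
                    continue

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return chunks, stop, thread
```

The training loop would then call `chunks.get()` to receive the next chunk while the worker prepares the following one, and call `stop.set()` when training ends. The bounded queue keeps the producer from running arbitrarily far ahead of the consumer.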
All of the output data will be written to tensorboard, which you can view with
tensorboard --logdir=doom/data/
Author
- Dane Corneil (EPFL)
License
This project is licensed under the MIT License; see the LICENSE.md file for details.
Acknowledgments
- The TensorFlow code was originally based in part on Jan Hendrik Metzen's VAE implementation
- The Atari environment wrapper is based on the implementation in Atari-DeepRL by Nathan Sprague
- The Cython priority queue module is forked from Nan Jiang's Cyheap tutorial
