Dapo

Source code for the paper "Divergence-Augmented Policy Optimization"

Distributed Accelerated Reinforcement Learning

This is an implementation of distributed reinforcement learning, used in several published works, including Divergence-Augmented Policy Optimization and Exponentially Weighted Imitation Learning for Batched Historical Data.

The project depends on a custom distributed replay memory called memoire. The commit history has been removed to protect sensitive IP and password information.
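
For readers unfamiliar with the term, a replay memory stores transitions collected by actors so that a learner can sample them later. The sketch below is a minimal single-process illustration of that idea only. It does not reflect the memoire API, and the class name `ReplayMemory` and its methods are hypothetical.

```python
import random
from collections import deque


class ReplayMemory:
    """Hypothetical single-process replay memory (NOT the memoire API)."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition as a tuple.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

A distributed replay memory additionally has to move transitions between actor and learner processes, which this sketch leaves out.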

Examples of how to use this project for (distributed) reinforcement learning can be found in example.

To replicate the results of our paper, please refer to the scripts in tools. The main entry point is tools/gen_atari_env.py, which can generate the shell script for running experiments in parallel and plotting the results with R.
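
As a rough illustration of the general pattern of generating a shell script that runs experiments in parallel, here is a minimal sketch. It is not the code of tools/gen_atari_env.py: the helper `make_parallel_script`, the `max_parallel` parameter, and the `train.py --env` command are all hypothetical placeholders.

```python
import shlex


def make_parallel_script(games, command_template, max_parallel=2):
    """Return the text of a shell script that runs one command per game.

    Hypothetical helper for illustration; not part of this project.
    """
    lines = ["#!/bin/sh"]
    for i, game in enumerate(games, start=1):
        # Quote the game name so it is safe to embed in a shell command.
        cmd = command_template.format(game=shlex.quote(game))
        # A trailing '&' launches the command in the background.
        lines.append(f"{cmd} &")
        # After every max_parallel launches, wait for the batch to finish.
        if i % max_parallel == 0:
            lines.append("wait")
    # Wait for a final, partially filled batch.
    if len(games) % max_parallel != 0:
        lines.append("wait")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 'train.py --env' is a placeholder command, not this project's CLI.
    games = ["PongNoFrameskip-v4", "BreakoutNoFrameskip-v4"]
    print(make_parallel_script(games, "python train.py --env {game}"))
```

Batching with `wait` keeps at most `max_parallel` jobs running at once. A real experiment launcher would likely also handle seeds, log directories, and device assignment.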
