Torchrl
PyTorch implementation of reinforcement learning algorithms (Soft Actor-Critic (SAC), DDPG, TD3, DQN, A2C, PPO, TRPO)
TorchRL
PyTorch implementation of RL methods
Environments with continuous and discrete action spaces are supported.
Environments with 1D and 3D observation spaces are supported.
Multi-process environments are supported.
Requirements
- General Requirements
- PyTorch 1.7
- Gym (0.10.9)
- MuJoCo (1.50.1)
- tabulate (for log)
- tensorboardX (log file output)
- Tensorboard Requirements
- TensorFlow: to start TensorBoard or read logs stored as TF records
Installation
- Use environment.yml to create a virtual environment:
conda env create -f environment.yml
source activate py_off
- Or manually install all requirements
Usage
Specify algorithm parameters in a config file, and pass the log directory, seed, and device as command-line arguments:
python examples/ppo_continuous_vec.py --config config/ppo_halfcheetah.json --seed 0 --device 0 --id ppo_halfcheetah
Check out the examples folder for details.
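To make the division between command-line arguments and the config file concrete, here is a minimal sketch of how a launcher script could read the flags shown in the command above and load a JSON config. It is not taken from the repository: the function names `parse_args`, `load_params`, and `main` are hypothetical, the defaults are assumptions, and the real example scripts may handle these flags differently.

```python
import argparse
import json
import random


def parse_args(argv=None):
    """Parse the flags used in the usage command (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Sketch of an RL launcher")
    parser.add_argument("--config", type=str, required=True,
                        help="path to a JSON file with algorithm parameters")
    parser.add_argument("--seed", type=int, default=0,
                        help="random seed (default is an assumption)")
    parser.add_argument("--device", type=int, default=0,
                        help="device index (default is an assumption)")
    parser.add_argument("--id", type=str, required=True,
                        help="experiment identifier, e.g. ppo_halfcheetah")
    return parser.parse_args(argv)


def load_params(path):
    """Load algorithm parameters from a JSON config file into a dict."""
    with open(path) as f:
        return json.load(f)


def main(argv=None):
    args = parse_args(argv)
    params = load_params(args.config)
    # Seed Python's RNG; a real script would also seed NumPy, PyTorch,
    # and the environment.
    random.seed(args.seed)
    return args, params
```

Calling `main(["--config", "config/ppo_halfcheetah.json", "--seed", "0", "--device", "0", "--id", "ppo_halfcheetah"])` would return the parsed arguments together with the config contents as a dictionary. The sketch deliberately assumes nothing about which keys the config file contains.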
Currently contains:
- On-Policy Methods:
- Reinforce
- A2C (Actor Critic)
- PPO (Proximal Policy Optimization)
- TRPO
- Off-Policy Methods:
- Soft Actor Critic: SAC (TwinSAC)
- Deep Deterministic Policy Gradient: DDPG
- TD3
- DQN:
- Basic Double DQN
- Bootstrapped DQN
- QRDQN
