[P] RL library focusing on easy experimental research with cloud experiment management

Hi everyone,

I have created yet another reinforcement learning library: https://github.com/vwxyzjn/cleanrl

This repository focuses on clean, minimal implementations of reinforcement learning algorithms. The highlights of this repo are:

  • Most algorithms are self-contained in single files, with a shared dependency file common.py that handles the different gym spaces.
  • Easy logging of the training process with TensorBoard, plus integration with wandb.com to log experiments in the cloud. Check out https://app.wandb.ai/costa-huang/cleanrltest.
  • Easily customizable, and debuggable directly in Python’s interactive shell.
  • Convenient command-line arguments for hyperparameter tuning.
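As a rough illustration of the single-file-plus-CLI pattern described above, here is a minimal sketch of how hyperparameters might be exposed as command-line flags. The flag names and defaults here are my own invention for illustration; CleanRL's actual interface may differ.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical hyperparameter flags for a single-file RL script;
    # these names are illustrative, not CleanRL's actual interface.
    parser = argparse.ArgumentParser(description="single-file RL agent sketch")
    parser.add_argument("--gym-id", type=str, default="CartPole-v1",
                        help="the id of the gym environment")
    parser.add_argument("--learning-rate", type=float, default=7e-4,
                        help="optimizer learning rate")
    parser.add_argument("--seed", type=int, default=1,
                        help="seed for reproducibility")
    parser.add_argument("--total-timesteps", type=int, default=100_000,
                        help="total environment steps to train for")
    return parser.parse_args(argv)


# A script built this way can be tuned from the shell, e.g.:
#   python dqn.py --learning-rate 1e-3 --seed 2
# and the parsed args can be logged to TensorBoard/wandb alongside metrics.
```

Because every hyperparameter lives in one flat namespace, an experiment is fully described by its command line, which is what makes the cloud logging mentioned above useful.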

Currently I support A2C, PPO, and DQN. If you are interested, please consider giving it a try 🙂

Motivation:

RL libraries fall at two ends of a spectrum. At one end is the demo kind: it really just demonstrates what an algorithm is doing, handles only a single gym environment, and makes it hard to record experiments or tune parameters.

At the other end of the spectrum, we have OpenAI/baselines, ray-project/ray, and a couple of Google repos. My personal experience with them is that I could only run benchmarks. They write modular code and employ good software engineering practices, but the problem is that Python is a dynamic language with limited IDE support. As a result, I had no idea what the variable types in different files were, and it was very difficult to do any kind of customization. I had to read through dozens of files before even being able to try some experiments.

That’s why I created this repo: it leans towards the first kind, but with real experimental support. It supports multiple gym spaces (still a work in progress), command-line arguments for tuning parameters, and seamless experiment logging, all of which I believe are essential for building a research pipeline.
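To show the kind of thing a shared common.py has to handle when supporting multiple gym spaces, here is one way the dispatch might look. The Discrete and Box classes below are self-contained stand-ins that mirror gym's space API; the real common.py presumably works with gym.spaces directly and may be structured quite differently.

```python
import math


# Minimal stand-ins for gym.spaces.Discrete / gym.spaces.Box so this
# sketch runs without gym installed; they mimic the relevant attributes.
class Discrete:
    def __init__(self, n):
        self.n = n  # number of discrete actions


class Box:
    def __init__(self, shape):
        self.shape = shape  # shape of the continuous action vector


def output_size(space):
    """Number of outputs a policy head needs for a given action space."""
    if isinstance(space, Discrete):
        return space.n  # one logit per discrete action
    if isinstance(space, Box):
        return math.prod(space.shape)  # one mean per continuous dimension
    raise NotImplementedError(f"unsupported space: {type(space).__name__}")
```

Centralizing this branching in one file is what lets each algorithm script stay flat and readable while still running on more than one kind of environment.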

submitted by /u/vwxyzjn