[N] Hindsight Experience Replay (HER) with SAC/DDPG/DQN support + Evolution Strategy bridge | Stable Baselines v2.6.0

Written by torontoai on June 16, 2019. Posted in Reddit MachineLearning.

Stable Baselines 2.6.0 was just released. It comes with a bunch of new features and improvements:

– a performance-tested re-implementation of Hindsight Experience Replay (HER) with SAC, DDPG and DQN support (the original OpenAI Baselines supported only a custom DDPG)

– you can now mix Reinforcement Learning (RL) and Evolution Strategies (ES) in a few lines of code, thanks to the new get/load parameters methods (see the gist below for an example with A2C + CMA-ES)

– a guide on dealing with NaNs and Infs was added to the documentation: https://stable-baselines.readthedocs.io/en/master/guide/checking_nan.html
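To give a rough idea of the kind of check the NaN/Inf guide is about, here is a minimal, library-independent sketch that fails as soon as a non-finite value shows up in an observation or reward. The helper `check_finite` is hypothetical and is not part of Stable Baselines; the guide linked above describes the library's own tooling.

```python
import math


def check_finite(name, values):
    """Raise ValueError if any value is NaN or +/-Inf (hypothetical helper)."""
    for index, value in enumerate(values):
        if math.isnan(value) or math.isinf(value):
            raise ValueError(f"{name}[{index}] is not finite: {value!r}")
    return values


# A finite observation passes through unchanged.
observation = check_finite("observation", [0.1, -2.0, 3.5])

# A NaN is reported with its position, so the source is easier to trace.
try:
    check_finite("reward", [1.0, float("nan")])
except ValueError as error:
    message = str(error)
```

Failing at the first non-finite value, rather than several updates later, is what makes this kind of check useful for tracking down where the problem started.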

Gist (for an example of mixing ES and RL): https://gist.github.com/araffin/404ef9625a4a78d42396c5292e465337
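The general pattern of driving a model through get/load parameters can be sketched without the library. This is only an illustration under stated assumptions: `ToyModel` is a hypothetical stand-in for an RL model, `fitness` is a toy objective instead of an episode return, and the optimizer is a plain (1+lambda) evolution strategy, not CMA-ES. The gist above shows the actual A2C + CMA-ES version.

```python
import random


class ToyModel:
    """Hypothetical stand-in for an RL model exposing get/load parameters."""

    def __init__(self):
        self.params = {"policy/w": [0.0, 0.0], "policy/b": [0.0]}

    def get_parameters(self):
        # Return a copy so callers cannot mutate the model by accident.
        return {name: list(values) for name, values in self.params.items()}

    def load_parameters(self, params):
        self.params = {name: list(values) for name, values in params.items()}


def fitness(model):
    """Toy objective: higher is better, maximum 0 when every parameter is 1."""
    flat = [v for values in model.params.values() for v in values]
    return -sum((v - 1.0) ** 2 for v in flat)


def evolve(model, generations=200, population=8, sigma=0.1, seed=0):
    """Plain (1+lambda) evolution strategy over the model parameters."""
    rng = random.Random(seed)
    best_params = model.get_parameters()
    best_fitness = fitness(model)
    for _ in range(generations):
        parent = best_params
        for _ in range(population):
            # Mutate the parent with Gaussian noise, then score the candidate.
            candidate = {
                name: [v + rng.gauss(0.0, sigma) for v in values]
                for name, values in parent.items()
            }
            model.load_parameters(candidate)
            score = fitness(model)
            if score > best_fitness:
                best_params, best_fitness = candidate, score
    # Leave the model holding the best parameters found.
    model.load_parameters(best_params)
    return best_fitness


model = ToyModel()
initial_fitness = fitness(model)
final_fitness = evolve(model)
```

With a real agent, `fitness` would be replaced by the mean episode reward of the policy, and the parameters could first be trained with RL before being handed to the ES, or the other way around.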

Colab Notebook (for testing HER): https://colab.research.google.com/drive/1VDD0uLi8wjUXIqAdLKiK15XaEe0z2FOc#scrollTo=qPg7pyvK_Emi

Documentation: https://stable-baselines.readthedocs.io/en/master/modules/her.html
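For readers new to HER, the core idea (relabeling stored transitions with goals that were actually reached) can be sketched in a few lines of plain Python. This is a simplified illustration, not the Stable Baselines implementation: the function names, the dictionary layout of a transition and the sparse 0/-1 reward are all assumptions made for the example, and the sampling follows the "future" idea of picking goals achieved later in the same episode.

```python
import random


def sparse_reward(achieved_goal, desired_goal):
    """0 when the goal is reached, -1 otherwise (assumed reward for this sketch)."""
    return 0.0 if achieved_goal == desired_goal else -1.0


def her_relabel(episode, n_sampled_goal=4, seed=0):
    """Return extra transitions whose goals are replaced by later achieved goals.

    Each transition is a dict with keys 'obs', 'action', 'achieved_goal' and
    'desired_goal'; 'achieved_goal' is the goal reached after taking the action.
    """
    rng = random.Random(seed)
    relabeled = []
    for t, transition in enumerate(episode):
        for _ in range(n_sampled_goal):
            # "future" strategy: use a goal achieved at step t or later.
            future = episode[rng.randrange(t, len(episode))]
            new_goal = future["achieved_goal"]
            relabeled.append({
                **transition,
                "desired_goal": new_goal,
                "reward": sparse_reward(transition["achieved_goal"], new_goal),
            })
    return relabeled


# A failed 3-step episode: the agent never reaches the desired goal (9,).
episode = [
    {"obs": (0,), "action": 1, "achieved_goal": (1,), "desired_goal": (9,)},
    {"obs": (1,), "action": 1, "achieved_goal": (2,), "desired_goal": (9,)},
    {"obs": (2,), "action": 1, "achieved_goal": (3,), "desired_goal": (9,)},
]
extra = her_relabel(episode, n_sampled_goal=2)
```

Every original transition here would earn a reward of -1, yet some relabeled copies earn 0, which gives an off-policy learner such as SAC, DDPG or DQN a useful signal even from failed episodes.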

Full changelog: https://github.com/hill-a/stable-baselines/releases

submitted by /u/araffin2