[N] Hindsight Experience Replay (HER) with SAC/DDPG/DQN support + Evolution Strategy bridge | Stable Baselines v2.6.0
Stable Baselines 2.6.0 was just released. It comes with a bunch of new features and improvements:
– a performance-tested re-implementation of Hindsight Experience Replay (HER) with SAC, DDPG and DQN support (the original OpenAI Baselines only supported a custom DDPG)
– you can now mix Reinforcement Learning (RL) and Evolution Strategies (ES) in a few lines of code, thanks to the new get/load parameters methods (see example below with A2C + CMA-ES)
– a guide was added to the documentation on dealing with NaNs and Infs: https://stable-baselines.readthedocs.io/en/master/guide/checking_nan.html
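
To give a rough idea of what the get/load parameters bridge makes possible, here is a minimal pure-Python sketch. It is not the code from the gist and it does not use Stable Baselines itself: `ToyModel`, `toy_fitness` and `es_finetune` are hypothetical stand-ins, and a simple (1+λ) evolution strategy is swapped in for CMA-ES. The only assumption about the real library is that a model can hand out its parameters as a dict and accept a modified dict back.

```python
import random
from collections import OrderedDict


def flatten(params):
    """Concatenate all parameter lists into one flat list, in dict order."""
    return [x for values in params.values() for x in values]


def unflatten(flat, template):
    """Rebuild a dict with the same keys and lengths as `template`."""
    out, i = OrderedDict(), 0
    for name, values in template.items():
        out[name] = flat[i:i + len(values)]
        i += len(values)
    return out


class ToyModel:
    """Hypothetical stand-in for an RL model with get/load parameter methods."""

    def __init__(self):
        self.params = OrderedDict(w=[0.0, 0.0], b=[0.0])

    def get_parameters(self):
        return OrderedDict((k, list(v)) for k, v in self.params.items())

    def load_parameters(self, params):
        self.params = OrderedDict((k, list(v)) for k, v in params.items())


def toy_fitness(model):
    """Toy objective (assumption): negative squared distance to a fixed target.
    With a real agent this would be the mean episode reward."""
    target = [1.0, -2.0, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(flatten(model.params), target))


def es_finetune(model, fitness, generations=100, pop_size=16, sigma=0.1, seed=0):
    """(1+lambda) evolution strategy applied to the model's flat parameters."""
    rng = random.Random(seed)
    template = model.get_parameters()
    parent = flatten(template)
    parent_fit = fitness(model)
    for _ in range(generations):
        best_child, best_child_fit = None, float("-inf")
        for _ in range(pop_size):
            # Mutate the parent with Gaussian noise and evaluate the child
            child = [x + rng.gauss(0.0, sigma) for x in parent]
            model.load_parameters(unflatten(child, template))
            child_fit = fitness(model)
            if child_fit > best_child_fit:
                best_child, best_child_fit = child, child_fit
        # Elitism: keep the parent unless the best child is strictly better
        if best_child_fit > parent_fit:
            parent, parent_fit = best_child, best_child_fit
    model.load_parameters(unflatten(parent, template))
    return parent_fit


model = ToyModel()
print("fitness before:", toy_fitness(model))
print("fitness after: ", es_finetune(model, toy_fitness))
```

In practice you would first train the model with RL, then let the ES loop perturb the trained weights and score each candidate by running evaluation episodes.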
Gist (for an example of mixing ES and RL): https://gist.github.com/araffin/404ef9625a4a78d42396c5292e465337
Colab Notebook (for testing HER): https://colab.research.google.com/drive/1VDD0uLi8wjUXIqAdLKiK15XaEe0z2FOc#scrollTo=qPg7pyvK_Emi
Documentation: https://stable-baselines.readthedocs.io/en/master/modules/her.html
Full changelog: https://github.com/hill-a/stable-baselines/releases
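
For readers new to HER, the core idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the Stable Baselines implementation: `her_relabel`, `reward_fn` and the transition layout are hypothetical, and the goal sampling is a loose version of the "future" strategy (goals achieved at the current or a later step of the same episode).

```python
import random


def reward_fn(achieved_goal, desired_goal):
    """Sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved_goal == desired_goal else -1.0


def her_relabel(episode, n_sampled_goal=4, rng=None):
    """Return the original transitions plus relabeled copies.

    Each transition is a dict with keys `obs`, `action`, `achieved_goal`
    (the goal reached after taking the action) and `desired_goal`.
    """
    rng = rng or random.Random(0)
    buffer = []
    for t, tr in enumerate(episode):
        # Store the original transition with its (usually failing) reward
        buffer.append(dict(tr, reward=reward_fn(tr["achieved_goal"], tr["desired_goal"])))
        for _ in range(n_sampled_goal):
            # Pretend a goal achieved later in the episode was the target all along
            future = episode[rng.randrange(t, len(episode))]
            new_goal = future["achieved_goal"]
            buffer.append(dict(tr, desired_goal=new_goal,
                               reward=reward_fn(tr["achieved_goal"], new_goal)))
    return buffer


# Toy 1-D episode: the agent walks from 0 to 3 but the goal (10) is never reached
episode = [{"obs": i, "action": 1, "achieved_goal": i + 1, "desired_goal": 10}
           for i in range(3)]
replay = her_relabel(episode)
print(len(replay), "transitions,",
      sum(tr["reward"] == 0.0 for tr in replay), "with a success reward")
```

Even though the original episode never succeeds, the relabeled copies contain successful transitions, which is what lets off-policy algorithms such as SAC, DDPG or DQN learn from sparse rewards.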
submitted by /u/araffin2