[P] Full Chainer Implementation of OpenAI’s Random Network Distillation
I released my Chainer implementation of OpenAI’s Random Network Distillation (RND) for reinforcement learning. The implementation is fairly complete and follows the paper almost exactly. Check it out at https://github.com/AdeelMufti/RL-RND.
Interestingly, I tried it on PLE’s PixelCopter with the extrinsic rewards turned off altogether, and it got roughly the same results as it did with the extrinsic rewards. I wrote about it here: http://blog.adeel.io/2019/04/13/reinforcement-learning-using-intrinsic-rewards-through-random-network-distillation-in-chainer/
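For anyone unfamiliar with how the intrinsic reward works, here is a rough toy sketch of the idea in plain Python. It is not the code from the repo and not the paper's setup: the `TinyRND` class, the linear "networks", the learning rate, and the observations are all made up for illustration (the real thing uses convolutional networks plus observation and reward normalization). The point is only that a fixed random target network and a trained predictor network disagree most on observations the predictor has not seen, and that disagreement is used as the reward.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim) standing in for a network."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, obs):
    """Apply the linear map to one observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


class TinyRND:
    """Hypothetical toy version of RND with linear maps instead of CNNs."""

    def __init__(self, obs_dim, feat_dim=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(obs_dim, feat_dim, rng)     # fixed, never trained
        self.predictor = make_linear(obs_dim, feat_dim, rng)  # trained to match target
        self.lr = lr

    def intrinsic_reward(self, obs):
        """Squared prediction error: large for unfamiliar observations."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, obs):
        """One SGD step on the predictor to reduce the squared error on obs."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        for i, row in enumerate(self.predictor):
            err = p[i] - t[i]
            for j, x in enumerate(obs):
                row[j] -= self.lr * 2.0 * err * x


rnd = TinyRND(obs_dim=3)
seen = [1.0, 0.0, 0.5]    # an observation the agent keeps visiting
unseen = [0.0, 1.0, 0.0]  # an observation it has never visited

before = rnd.intrinsic_reward(seen)
for _ in range(50):
    rnd.update(seen)
after = rnd.intrinsic_reward(seen)
novel = rnd.intrinsic_reward(unseen)
```

After the loop, `after` is far smaller than `before`, while `novel` stays high because the predictor was never trained on that observation. That gap is what pushes the agent toward unexplored states even when the extrinsic reward is switched off.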
Would someone with a free GPU sitting around mind running it on Montezuma’s Revenge for a while? I’m curious to see what this implementation can achieve.