[Project] Massively parallel, vectorised implementation of Snake and RL solution
As part of a recent side project to learn about reinforcement learning, I’ve created a clone of the classic Snake game as a reinforcement learning environment and solved it with advantage actor-critic (A2C). This is one of the warm-ups from OpenAI’s Requests for Research 2.0 (https://openai.com/blog/requests-for-research-2/).
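For readers new to the method, here is a minimal sketch of the advantage calculation at the heart of advantage actor-critic. It is a generic, pure-Python illustration for a single environment, not the code from this project; the function name `n_step_advantages`, the default `gamma`, and the argument layout are all assumptions made for the example.

```python
def n_step_advantages(rewards, values, bootstrap_value, dones, gamma=0.99):
    """Compute discounted returns and advantages for one rollout.

    rewards[t], values[t] and dones[t] describe step t of the rollout.
    bootstrap_value is the critic's estimate for the state reached after
    the final step (hypothetical naming, chosen for this sketch).
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # The episode ended at step t, so nothing flows back past it.
            running = 0.0
        running = rewards[t] + gamma * running
        returns[t] = running
    # Advantage: how much better the outcome was than the critic predicted.
    advantages = [g - v for g, v in zip(returns, values)]
    return returns, advantages
```

In a typical A2C update the actor's log-probabilities are weighted by these advantages, while the critic is regressed towards the returns.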
You might be thinking this sounds like a very run-of-the-mill introductory RL project. Well, here are a few things that I think make it more interesting than that.
Here’s a GIF of one of the final policies:
[GIF: one of the final policies]
I’m currently working on the “Slitherin'” suggestion from OpenAI’s Requests for Research 2.0. Here’s a preliminary GIF:
[GIF: preliminary Slitherin' result]