[P] Ideas needed on simulating/training Burglar and Guard Agents
I’ve also posted this on r/reinforcementlearning, but I’m reposting it here to reach a wider audience.
For a research project, I want to create a simulation of how burglars and guards behave in a grid world. There are two types of agents: the burglar needs to grab one of the treasures (located somewhere in the grid world) and then reach the edge of the grid world, while the guard needs to both protect the treasures and catch the burglar.
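To make the setup concrete, here is a rough sketch of the kind of environment I have in mind. Everything in it is a placeholder assumption rather than a settled design: the class name `BurglarGuardGrid`, the five-action move set, simultaneous moves, the +1/-1 rewards, and the rule that the burglar is caught only when both agents end a turn on the same cell.

```python
from dataclasses import dataclass, field

# Hypothetical action set: four cardinal moves plus standing still.
MOVES = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
    "stay": (0, 0),
}


@dataclass
class BurglarGuardGrid:
    """Toy one-burglar, one-guard grid world (all defaults are assumptions)."""

    size: int = 7
    treasures: set = field(default_factory=lambda: {(3, 3)})
    burglar: tuple = (0, 0)
    guard: tuple = (3, 4)
    has_treasure: bool = False

    def _move(self, pos, action):
        # Apply the move and clamp the result so agents stay on the grid.
        dr, dc = MOVES[action]
        row = min(max(pos[0] + dr, 0), self.size - 1)
        col = min(max(pos[1] + dc, 0), self.size - 1)
        return (row, col)

    def _on_edge(self, pos):
        last = self.size - 1
        return pos[0] in (0, last) or pos[1] in (0, last)

    def step(self, burglar_action, guard_action):
        """Move both agents at once; return (burglar_reward, guard_reward, done)."""
        self.burglar = self._move(self.burglar, burglar_action)
        self.guard = self._move(self.guard, guard_action)

        # Assumed capture rule: same cell at the end of the turn.
        if self.burglar == self.guard:
            return -1.0, 1.0, True

        # Picking up a treasure removes it from the grid.
        if self.burglar in self.treasures:
            self.treasures.discard(self.burglar)
            self.has_treasure = True

        # The burglar wins by reaching any edge cell while carrying a treasure.
        if self.has_treasure and self._on_edge(self.burglar):
            return 1.0, -1.0, True

        return 0.0, 0.0, False
```

With this sketch, the two agents swapping cells in a single turn does not count as a capture, which is one of the design choices I am unsure about.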
We want to use the simulation to learn three things:
- What strategy does the burglar use?
- What strategy is optimal for the guards?
- What types of information benefit the guards the most? (e.g. treasure locations, a large detection range, an imprecise burglar location)
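For the third question, one way I imagine comparing information types is to vary what the guard observes about the burglar. The function below is only a sketch under my own assumptions: the name `guard_observation`, Manhattan distance for the detection range, and uniform integer noise on the reported position are all hypothetical choices.

```python
import random


def guard_observation(guard, burglar, detection_range, noise=0, rng=random):
    """Return what the guard sees of the burglar, or None if out of range.

    guard, burglar: (row, col) positions.
    detection_range: maximum Manhattan distance at which the burglar is seen.
    noise: each reported coordinate is shifted by a uniform integer in
        [-noise, noise]; 0 means the exact position is reported.
    rng: the random module or a random.Random instance (for seeding).
    """
    distance = abs(guard[0] - burglar[0]) + abs(guard[1] - burglar[1])
    if distance > detection_range:
        return None

    # The noisy estimate is not clamped, so it may fall outside the grid.
    row = burglar[0] + rng.randint(-noise, noise)
    col = burglar[1] + rng.randint(-noise, noise)
    return (row, col)
```

Sweeping `detection_range` and `noise` across training runs and comparing the guards' capture rate would then give a rough measure of how much each kind of information is worth.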
I would love to add terrain types and other features to make the simulation more realistic, but the problem is probably hard enough as it is. Do any of you have ideas on how to simulate and train these agents?
Thanks in advance