[D][RL] Tips on solving short episodes with large action space and huge state space?
I am trying to train an agent to solve a problem with 2^n states and n actions, where n ≈ 15000. I am using a DQN. Roughly 99.9% of actions leave the state unchanged; only m of them change it, and those essentially do a bit flip on the state. Each episode can last up to m state changes. Do you guys have any suggestions on what I should read to make my life easier with this problem?
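
To make the setup concrete, here is a rough toy sketch of the kind of environment I mean. It is not my actual problem: the class name `BitFlipEnv`, the default m = 15 (just what 0.1% of 15000 would give), the all-zeros start state, and the idea that the m effective actions are fixed and picked at random are all assumptions for illustration. Rewards are left out on purpose.

```python
import numpy as np


class BitFlipEnv:
    """Hypothetical toy stand-in: n-bit state, n actions, only m actions flip a bit."""

    def __init__(self, n=15000, m=15, seed=0):
        rng = np.random.default_rng(seed)
        self.n, self.m = n, m
        # Assumption: the m effective actions are fixed and chosen at random.
        self.effective = np.zeros(n, dtype=bool)
        self.effective[rng.choice(n, size=m, replace=False)] = True
        self.reset()

    def reset(self):
        # Assumption: every episode starts from the all-zeros state.
        self.state = np.zeros(self.n, dtype=np.int8)
        self.changes = 0
        return self.state.copy()

    def step(self, action):
        # Most actions are no-ops; an effective action flips its own bit.
        changed = bool(self.effective[action])
        if changed:
            self.state[action] ^= 1
            self.changes += 1
        # The episode ends once m state changes have happened.
        done = self.changes >= self.m
        return self.state.copy(), changed, done
```

In this sketch a uniformly random action changes the state with probability m/n, so with the defaults about 999 out of 1000 steps do nothing, which is the part that makes exploration painful for me.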