[D] Why does deep reinforcement learning not generalize?
Multiple groups have flagged this issue:
“Assessing Generalization in Deep Reinforcement Learning” https://bair.berkeley.edu/blog/2019/03/18/rl-generalization/
We present a benchmark for studying generalization in deep reinforcement learning (RL). Systematic empirical evaluation shows that vanilla deep RL algorithms generalize better than specialized schemes designed specifically to tackle generalization. In other words, simply training on varied environments is so far the most effective strategy for generalization.
“Quantifying Generalization in Reinforcement Learning” https://openai.com/blog/quantifying-generalization-in-reinforcement-learning/
Generalizing between tasks remains difficult for state-of-the-art deep reinforcement learning (RL) algorithms. Although trained agents can solve complex tasks, they struggle to transfer their experience to new environments. Even though RL agents are known to overfit — that is, to latch onto the specifics of their environment rather than learn generalizable skills — RL agents are still benchmarked by evaluating on the environments they trained on. This would be like testing on your training set in supervised learning!
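The train/test split the OpenAI post argues for amounts to a held-out-seed protocol: train on one set of procedurally generated levels, evaluate on levels the agent has never seen. A minimal sketch of that protocol (the toy environment and memorizing "policy" here are hypothetical stand-ins, not the actual CoinRun benchmark):

```python
import random

def make_env(seed):
    """Toy 'environment': a fixed random target keyed by its seed,
    standing in for a procedurally generated level."""
    rng = random.Random(seed)
    return rng.uniform(0.0, 1.0)

def evaluate(policy, seeds):
    """Mean reward of a policy over the environments given by `seeds`."""
    return sum(policy(make_env(s)) for s in seeds) / len(seeds)

# An overfit 'agent' that memorizes the levels it trained on.
train_seeds = list(range(100))      # levels seen during training
test_seeds = list(range(100, 200))  # held-out levels, never seen
memorized = {make_env(s) for s in train_seeds}
policy = lambda target: 1.0 if target in memorized else 0.0

train_score = evaluate(policy, train_seeds)  # looks perfect
test_score = evaluate(policy, test_seeds)    # memorization transfers nothing
print(train_score, test_score)
```

Reporting only `train_score` makes the agent look solved; the held-out `test_score` is what actually measures generalization, which is the evaluation gap the quoted post criticizes.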
Why is this issue specific to deep RL? Is it simply the evaluation metrics the field has been using (training on the test set)?