[D] Have we hit the limits of Deep Reinforcement Learning?
As per this thread and this tweet, OpenAI Five was trained on something like 45,000 years of gameplay experience, and it took less than one day for humans to figure out strategies to consistently beat it.
OpenAI Five, together with AlphaStar, is the largest and most sophisticated implementation of DRL, and yet it falls short of human intelligence by this huge margin. And I bet that AlphaStar would suffer the same fate if it were released as a bot for anybody to play against.
I know there is lots of research going on to make DRL more data efficient, and to make deep learning in general more robust to out-of-distribution and adversarial examples, but the gap with humans here is so extreme that I doubt it can be meaningfully closed by anything short of a paradigm shift.
What are your thoughts? Is this the limit of what can be achieved by DRL, or is there still hope to push the paradigm forward?