[D] What do you see as the most promising directions for reducing sample inefficiency in reinforcement learning?
I often read, sometimes from ML researchers but more often from computational cognitive scientists, that humans can generalize patterns from only a few data points or draw on “rich, informative priors” even as children, and that this ability is an important part of what sets us, as cognitive beings, apart from the neural network approaches to RL used today.
I’m also not entirely convinced that the current neural net paradigm, with its McCulloch–Pitts-esque neurons, is ever going to become sample efficient enough for real-world reinforcement learning tasks. Despite our best efforts to increase sample efficiency in NN techniques, the most impressive results still seem to rely on hundreds of thousands or more simulations/data points, which could be infeasible to collect in any sufficiently complex real-world environment.
That being said, what approaches are you most excited about for improving sample efficiency in reinforcement learning, or in neural network techniques in general?