[1905.10027] Neural Temporal-Difference Learning Converges to Global Optima
https://arxiv.org/abs/1905.10027
Does this mean we can use deep neural networks with TD(0) without worrying about whether it converges?
submitted by /u/banananach