[D] State-of-the-art optimizers
I'm not sure whether there is a consensus on this. SGD with momentum, Adam, RMSProp, Adagrad, Adadelta, and probably others are all in widespread use, but is there an optimizer that is considered SOTA for DNNs “most of the time”? Or is it generally accepted that there is a collection of “good” optimizers whose efficacy varies with the task and architecture?
submitted by /u/doctorjuice