Decaying the learning rate is a popular practice even for adaptive optimizers such as Adam. Increasing the batch size has also been shown to have a similar effect.
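For reference, a minimal sketch of the standard practice the post describes, assuming PyTorch (the post names no framework) and an arbitrary toy model and step schedule:

```python
import torch
import torch.nn as nn

# Toy model and data purely for illustration (not from the post).
model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Multiply the learning rate by 0.1 every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

x, y = torch.randn(64, 10), torch.randn(64, 1)
for epoch in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch
```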
But there are other hyperparameters of a similar nature.
– Does it make sense to decay/increase them?
– Has anyone tried decaying momentum, decaying the dropout rate, or increasing L2 regularization? (A rough sketch of how such schedules can be wired up follows this list.)
– Are there other hyperparameters that need tuning like this?
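Mechanically, scheduling these other hyperparameters is straightforward; whether it helps is exactly the open question the post asks. A hedged sketch, again assuming PyTorch, mutating `optimizer.param_groups` and the `Dropout` modules once per epoch; the linear schedules and all constants here are arbitrary illustrative choices, not a recommendation:

```python
import torch
import torch.nn as nn

# Toy setup purely for illustration.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=1e-4)

num_epochs = 100
x, y = torch.randn(64, 10), torch.randn(64, 1)
for epoch in range(num_epochs):
    frac = epoch / num_epochs
    for group in optimizer.param_groups:
        group["momentum"] = 0.9 * (1 - frac)           # linearly decay momentum
        group["weight_decay"] = 1e-4 * (1 + 9 * frac)  # ramp L2 regularization up 10x
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = 0.5 * (1 - frac)                     # linearly decay dropout rate

    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
```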
submitted by /u/thntk