https://medium.com/@lessw/new-state-of-the-art-ai-optimizer-rectified-adam-radam-5d854730807b
This blog post discusses a new optimizer built on top of Adam, introduced in this paper by Liyuan Liu et al. Essentially, the authors seek to understand why a warmup phase is beneficial when scheduling learning rates, and identify the underlying problem as the high variance of the adaptive learning rate during the first few batches, which leads to bad local optima and poor generalization. They find that the issue can be remedied either by a warmup/low initial learning rate or by effectively turning off the adaptive term for the first few batches; as more training examples are fed in, the variance stabilizes and the adaptive learning rate can be used safely. They therefore propose Rectified Adam (RAdam), which dynamically rectifies the adaptive learning rate so that it is only applied once its variance becomes tractable. The author of the blog post tests an implementation in fastai and finds that RAdam works well in many different contexts, well enough to top the leaderboard of the Imagenette mini-competition.
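To make the mechanism concrete, here is a rough NumPy sketch of a single RAdam update, following Algorithm 2 of the paper. The function name radam_step and the stand-alone NumPy framing are my own illustration rather than the reference code; the real PyTorch/fastai implementations are linked below.

```python
import numpy as np

def radam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One RAdam update for a single parameter array (illustrative sketch)."""
    # Exponential moving averages of the gradient and squared gradient, as in Adam.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)

    # Length of the approximated simple moving average; rho_inf is its asymptotic value.
    rho_inf = 2 / (1 - beta2) - 1
    rho_t = rho_inf - 2 * t * beta2 ** t / (1 - beta2 ** t)

    if rho_t > 4:
        # Variance of the adaptive learning rate is tractable: apply the
        # rectification term r_t and use the full Adam-style update.
        v_hat = np.sqrt(v / (1 - beta2 ** t))
        r_t = np.sqrt(((rho_t - 4) * (rho_t - 2) * rho_inf) /
                      ((rho_inf - 4) * (rho_inf - 2) * rho_t))
        param = param - lr * r_t * m_hat / (v_hat + eps)
    else:
        # Early iterations: fall back to an un-adapted, SGD-with-momentum-like step.
        param = param - lr * m_hat
    return param, m, v
```

Early on (small t), rho_t stays below 4, so the update behaves like SGD with momentum; as training progresses the rectification term approaches 1 and the step converges to plain Adam.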
Implementations can be found on the author’s GitHub.
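For context, here is a hedged sketch of how one might drop such an optimizer into fastai v1, roughly the setup the blog post describes. The radam import path and the URLs.IMAGENETTE_160 constant are assumptions based on the linked repo and the fastai version current at the time; check them against your install.

```python
from functools import partial
from fastai.vision import *   # fastai v1-style star import
from radam import RAdam       # assumed import path for the RAdam class from the linked repo

# Download a small Imagenette variant and build a DataBunch
# (URLs.IMAGENETTE_160 is assumed; the constant name may differ across fastai versions).
path = untar_data(URLs.IMAGENETTE_160)
data = ImageDataBunch.from_folder(path, valid='val', ds_tfms=get_transforms(),
                                  size=128, bs=64).normalize(imagenet_stats)

# Swap fastai's default AdamW for RAdam via opt_func and train as usual.
learn = cnn_learner(data, models.resnet50, metrics=accuracy, opt_func=partial(RAdam))
learn.fit_one_cycle(5, max_lr=1e-3)
```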
submitted by /u/jwuphysics