[D] Rectified Adam (RAdam): a new state of the art optimizer

This blog post discusses a new optimizer built on top of Adam, introduced in this paper by Liyuan Liu et al. Essentially, they seek to understand why a warmup phase is beneficial when scheduling learning rates, and identify the underlying problem: during the first few batches there are too few gradient samples, so Adam's adaptive learning rate has problematically high variance, which hurts generalization. They find the issue can be remedied either by a warmup/low initial learning rate or by turning off the adaptive term for the first few batches; as more training examples are fed in, the variance stabilizes and the adaptive learning rate can safely take over. They therefore propose Rectified Adam (RAdam), which dynamically rescales the adaptive step in a way that hedges against this early high variance. The author of the blog post tests an implementation in fastai and finds that RAdam works well in many different contexts, enough to take the leaderboard of the Imagenette mini-competition.
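The rectification described above can be sketched as follows. This is a minimal, illustrative NumPy version of one RAdam update based on the formulas in the Liu et al. paper, not the author's fastai implementation; the function name and signature are my own.

```python
import math
import numpy as np

def radam_step(theta, grad, m, v, t, lr=1e-3,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """One RAdam update on a NumPy parameter array (sketch, t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment
    m_hat = m / (1 - beta1 ** t)              # bias-corrected momentum
    rho_inf = 2.0 / (1.0 - beta2) - 1.0       # max length of approximated SMA
    rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)
    if rho_t > 4.0:
        # Enough samples: variance of the adaptive term is tractable,
        # so take a rectified adaptive step.
        v_hat = np.sqrt(v / (1 - beta2 ** t))
        r_t = math.sqrt(((rho_t - 4) * (rho_t - 2) * rho_inf)
                        / ((rho_inf - 4) * (rho_inf - 2) * rho_t))
        theta = theta - lr * r_t * m_hat / (v_hat + eps)
    else:
        # Too few samples to trust the variance estimate:
        # fall back to plain SGD with momentum (no adaptive term).
        theta = theta - lr * m_hat
    return theta, m, v
```

With the default beta2 = 0.999, rho_t is about 1 at t = 1, so the first steps take the un-rectified momentum branch; as t grows, rho_t approaches rho_inf and the rectifier r_t approaches 1, recovering standard Adam.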

Implementations can be found on the author’s GitHub.

submitted by /u/jwuphysics

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, VR, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.