Learn About Our Meetup

5000+ Members



Join our meetup to learn, connect, share, and get to know your Toronto AI community.



Browse the latest deep learning, AI, and machine learning postings from Indeed for the GTA.



If you are looking to sponsor space, be a speaker, or volunteer, feel free to give us a shout.

[D] Follow The Regularized Leader (FTRL) algorithm to model user interests (which might change over time)?


Hi there,

I’ve been working on a news recommendation problem, and I’m implementing a factorization machine optimized by FTRL (the original paper is here: ) to model user interests in an online learning fashion. As users’ reading interests may change over time, I’d like the model to capture that change quickly. But the learning rate of FTRL decays according to the following equation:

FTRL per-coordinate learning rate
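For reference, the per-coordinate learning rate from the FTRL-Proximal paper takes the form:

```latex
\eta_{t,i} = \frac{\alpha}{\beta + \sqrt{\sum_{s=1}^{t} g_{s,i}^{2}}}
```

where \(g_{s,i}\) is the gradient for coordinate \(i\) at step \(s\), and \(\alpha, \beta\) are hyperparameters. Since the sum of squared gradients can only grow, \(\eta_{t,i}\) decreases monotonically.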

The learning rate decreases monotonically over time, so after training runs for a while, model updates become very small, making it hard to follow changes in users’ interests.

What I’m trying is to not accumulate the squared gradients from the beginning, but to weight recent ones more heavily. To do this, I changed the gradient-accumulating line in the pseudocode to the following, where lambda is a number in (0, 1), e.g. 0.99. This way, I hope gradients from long ago contribute little to the denominator.
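The proposed change amounts to an exponential moving average of squared gradients instead of a plain sum. A minimal sketch, where `update_accumulator` and the list-based state are illustrative stand-ins for whatever per-coordinate accumulator the implementation actually uses:

```python
def update_accumulator(n, g, lam=0.99):
    """Decayed squared-gradient accumulation for each coordinate i.

    Original FTRL accumulation:  n[i] += g[i] ** 2
    Proposed change:             n[i] = lam * n[i] + g[i] ** 2
    With lam in (0, 1), old gradients are down-weighted geometrically,
    so the effective learning rate no longer decays to zero.
    """
    for i, gi in enumerate(g):
        n[i] = lam * n[i] + gi ** 2
    return n
```

With `lam = 0.99`, a squared gradient from 100 steps ago is weighted by roughly 0.99**100 ≈ 0.37, so the denominator tracks recent gradient magnitudes rather than the full history.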

Could someone familiar with FTRL tell me whether this makes sense, or whether it is mathematically valid? The mathematics behind FTRL is beyond me.

Thanks in advance : )

changed gradient accumulating style

FTRL original implementation
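For context, here is a minimal Python sketch of the standard FTRL-Proximal per-coordinate update for logistic loss, following the widely cited pseudocode; the class name and hyperparameter defaults are illustrative, not the poster's actual code:

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Sketch of per-coordinate FTRL-Proximal for logistic loss."""

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # lazily stored weight state
        self.n = defaultdict(float)  # accumulated squared gradients

    def _weight(self, i):
        """Closed-form weight with L1 sparsity from the stored z state."""
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        return -(z - math.copysign(self.l1, z)) / (
            (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2)

    def predict(self, x):
        """x: dict of feature index -> value; returns P(y = 1)."""
        s = sum(self._weight(i) * v for i, v in x.items())
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example (x, y) with y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g)
                     - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g  # the line the question proposes to decay
```

The question's proposal would replace the last line with `self.n[i] = lam * self.n[i] + g * g`. Note this also changes `sigma`, which is derived from consecutive values of `n`, so the theoretical regret guarantees of FTRL would no longer apply as-is.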

submitted by /u/hunter7z

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, VR, robotics, and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.