[D] Using the Follow The Regularized Leader (FTRL) algorithm to model user interests that might change over time?
I’ve been working on a news recommendation problem, and I’m implementing a factorization machine optimized by FTRL (the original paper is here: https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/41159.pdf) to model user interests in an online-learning fashion. Since users’ reading interests may change over time, I’d like the model to capture such changes quickly. But FTRL’s per-coordinate learning rate decays according to

    eta_{t,i} = alpha / (beta + sqrt(sum_{s=1}^{t} g_{s,i}^2))

Because the sum of squared gradients in the denominator only grows, the learning rate decreases monotonically over time. So after training for a while, the model updates become very slow, which makes it hard to follow the users’ shifting interests.
What I’m trying is this: instead of accumulating the squares of the gradients from the beginning, only accumulate recent ones. To do this, I changed the gradient-accumulating line in the pseudocode (n_i <- n_i + g_i^2) to

    n_i <- lambda * n_i + g_i^2

where lambda is a number in (0, 1), e.g. 0.99. This way, I hope the gradients from long ago make little contribution to the denominator.
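To make the idea concrete, here is a minimal per-coordinate sketch of the FTRL-Proximal update (Algorithm 1 in the linked paper) with my decay bolted on. The parameter names alpha, beta, l1, l2 follow the paper; `decay` plays the role of my lambda, and decay=1.0 recovers the original algorithm. This is just an illustration of what I mean, not a vetted implementation:

```python
import math

def ftrl_update(z, n, w, g, alpha=0.1, beta=1.0, l1=1.0, l2=1.0, decay=1.0):
    """One per-coordinate FTRL-Proximal step with an optional decay
    on the accumulated squared gradients. decay=1.0 gives the paper's
    Algorithm 1; decay < 1 is the modification I'm asking about."""
    # With decay == 1 this is exactly the paper's n_i <- n_i + g_i^2.
    n_new = decay * n + g * g
    # Learning-rate schedule term. Note: with decay < 1, sqrt(n_new) can be
    # smaller than sqrt(n), making sigma negative -- this is exactly where
    # the paper's telescoping-sum argument no longer applies as-is.
    sigma = (math.sqrt(n_new) - math.sqrt(n)) / alpha
    z_new = z + g - sigma * w
    # Closed-form solution of the per-coordinate FTRL-Proximal objective.
    if abs(z_new) <= l1:
        w_new = 0.0
    else:
        w_new = -(z_new - math.copysign(l1, z_new)) / (
            (beta + math.sqrt(n_new)) / alpha + l2
        )
    return z_new, n_new, w_new
```

With constant gradient magnitude, the decayed accumulator is bounded by 1/(1 - lambda) instead of growing without limit, so the effective learning rate alpha / (beta + sqrt(n)) stays bounded away from zero, which is the behavior I want.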
Could someone familiar with FTRL tell me whether this makes sense, i.e. whether it is mathematically valid? The mathematics behind FTRL is beyond me.
Thanks in advance : )