[P] Predict figure skating world championship ranking from season performances (part 2: hybrid models learned by gradient descent)

I recently posted the write-up on the first part of my project (GitHub repo) to predict how skaters would rank in the figure skating world championship from the scores they earned earlier in the season. The main idea is to separate the skater effect (the intrinsic ability of each skater) from the event effect (the influence of an event on a skater’s performance), so that a more accurate ranking can be built.

In that previous part, I considered two simple models to find the latent skater scores that are used to rank the skaters:

  1. Score of a skater at an event = baseline score + latent skater score + latent event score

  2. Score of a skater at an event = baseline score × latent skater score × latent event score
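For intuition, here is a minimal sketch of why the additive model (model 1) has a closed-form solution: on a complete score matrix, the least-squares fit is just the grand mean plus row and column effects. The 3×3 matrix below is made-up data, not the project's:

```python
import numpy as np

# Hypothetical skater-by-event score matrix (rows: skaters, cols: events).
scores = np.array([
    [220.0, 210.0, 230.0],
    [200.0, 190.0, 205.0],
    [180.0, 175.0, 185.0],
])

# Closed-form least-squares fit of the additive model on a complete matrix:
# baseline = grand mean, latent scores = row/column means minus the baseline.
baseline = scores.mean()
skater = scores.mean(axis=1) - baseline  # latent skater scores
event = scores.mean(axis=0) - baseline   # latent event scores

predicted = baseline + skater[:, None] + event[None, :]
rmse = np.sqrt(((scores - predicted) ** 2).mean())

# Predicted ranking: sort skaters by latent score, best first.
ranking = np.argsort(-skater)
```

Sorting skaters by their latent scores gives the predicted ranking; the multiplicative model (model 2) can be fit the same way on log-scores when all scores are positive.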

In this part of the project (analysis, write-up), I consider a hybrid model of those two:

Score of a skater at an event = baseline score + latent skater score × latent event score

Unlike the earlier models, this hybrid model has no closed-form solution for its parameters, so gradient descent was used to learn them. This resulted in a neat little animation that tracks how the model residuals, the RMSE, and the predicted ranking improve as gradient descent runs. I also explore different strategies to reduce overfitting (so that the model predicts skater rankings more accurately), using familiar methods such as model penalization and early stopping.
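As a concrete illustration of the gradient-descent fit for the hybrid model, a full-batch version on made-up data with an assumed learning rate (a sketch, not the project's code) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
scores = np.array([  # hypothetical skater-by-event score matrix
    [220.0, 210.0, 230.0],
    [200.0, 190.0, 205.0],
    [180.0, 175.0, 185.0],
])

baseline = scores.mean()                        # baseline score, held fixed
skater = rng.normal(0.0, 0.1, scores.shape[0])  # latent skater vector
event = rng.normal(0.0, 0.1, scores.shape[1])   # latent event vector

# RMSE at initialization, for comparison after training.
rmse0 = np.sqrt(((scores - (baseline + np.outer(skater, event))) ** 2).mean())

lr = 1e-3  # assumed learning rate
for step in range(2000):
    resid = scores - (baseline + np.outer(skater, event))
    # Gradients of the summed squared error w.r.t. each latent vector.
    grad_skater = -2.0 * resid @ event
    grad_event = -2.0 * resid.T @ skater
    skater -= lr * grad_skater
    event -= lr * grad_event

resid = scores - (baseline + np.outer(skater, event))
rmse = np.sqrt((resid ** 2).mean())
```

Recording `rmse` and the ranking implied by `skater` at each step is what produces an animation like the one linked above: both improve steadily as the descent runs.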

Lastly, note that this hybrid model is nothing but a factorization of the event–skater score matrix into an event-specific vector and a skater-specific vector whose product approximates the score matrix. Therefore, the gradient descent used to learn these latent vectors is very similar to that of the famous FunkSVD algorithm, which learns user-specific and item-specific latent factors whose product approximates the rating matrix of a recommender system (in this case, user = skater and item = event). However, FunkSVD uses multiple factors, and in the next part of my project I will show how multi-factor matrix factorization can be applied to this ranking problem.
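To make the FunkSVD analogy concrete, here is a sketch of single-factor, per-entry SGD updates on made-up sparse observations (the hyperparameters are assumed, and this is the standard FunkSVD-style update rule rather than my exact code):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical sparse (skater, event, score) observations: not every
# skater competes at every event.
obs = [(0, 0, 220.0), (0, 2, 230.0), (1, 0, 200.0),
       (1, 1, 190.0), (2, 1, 175.0), (2, 2, 185.0)]

baseline = np.mean([y for _, _, y in obs])
skater = rng.normal(0.0, 0.1, 3)  # one latent factor per skater
event = rng.normal(0.0, 0.1, 3)   # one latent factor per event

lr, reg = 2e-3, 0.02  # assumed learning rate and L2 penalty
for epoch in range(500):
    for i, j, y in obs:
        err = y - (baseline + skater[i] * event[j])
        s_i = skater[i]  # update both factors from the same error
        skater[i] += lr * (err * event[j] - reg * skater[i])
        event[j] += lr * (err * s_i - reg * event[j])

rmse = np.sqrt(np.mean([(y - (baseline + skater[i] * event[j])) ** 2
                        for i, j, y in obs]))
```

With k factors, `skater` and `event` become n × k matrices and the same update runs factor by factor, which is exactly the FunkSVD scheme with user = skater and item = event.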

If you have any questions or feedback, just let me know 🙂

submitted by /u/seismatica