[P] Predict figure skating world championship ranking from season performances (part 6: rank aggregation)
I’m trying to predict the ranking of figure skaters in the annual world championship from their scores in earlier competition events in the season. The obvious method is to average each skater's scores across past events and rank the skaters by those averages. However, since no two events are the same, the goal of my project is to separate the skater effect (the intrinsic ability of each skater) from the event effect (how an event influences a skater's score).
In the previous 5 parts of my project, I developed several models to predict the ranking of skaters (as outlined in an earlier Reddit post). In this last part, I try to combine these rankings into a final ranking that will hopefully be more accurate than any of the individual rankings. You can read the write-up for it here.
I used two different approaches to combine the rankings:
- An unsupervised approach using the centuries-old Borda count, a method for tallying ranked votes.
- A supervised approach using logistic regression to combine the scores from each model more intelligently, using the world championship itself as a guide.
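For anyone unfamiliar with Borda count, here is a minimal sketch of the idea in Python. It is not the code from my repo: the function name `borda_count`, the toy skater labels, and the alphabetical tie-break are all assumptions made for illustration. Each model's ranking acts as one "ballot": a skater earns more points the higher a model ranks them, and the final ranking sorts skaters by total points.

```python
from collections import defaultdict


def borda_count(rankings):
    """Combine several rankings (each listed best-first) into one ranking.

    Hypothetical helper for illustration: with n skaters in a ranking,
    the top skater earns n - 1 points, the next n - 2, and so on down to 0.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, skater in enumerate(ranking):
            points[skater] += n - 1 - position
    # Sort by total points (highest first); ties are broken alphabetically,
    # which is an arbitrary choice made only to keep the output deterministic.
    return sorted(points, key=lambda s: (-points[s], s))


# Toy example: three models each rank the same four (made-up) skaters.
model_rankings = [
    ["A", "B", "C", "D"],
    ["B", "A", "C", "D"],
    ["A", "C", "B", "D"],
]
print(borda_count(model_rankings))
```

In this toy example, A collects 8 points, B 6, C 4, and D 0, so the combined ranking is A, B, C, D. Note that Borda count never looks at the actual world championship results, which is what makes it unsupervised.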
Finally, all of the 7 ranking models that I developed in my project are benchmarked on the 5 seasons in the test set. I won’t spoil the details and explanations of the final result (you can see a glimpse of it here), but let’s just say that predicting sports is hard AF!
You can check out the GitHub repo of the project for all my analyses. I’m more than happy to answer any questions or hear any feedback you might have about my project. Thank you for taking the time to read it.
submitted by /u/seismatica