[D] AISTATS 2020 Reviews
AISTATS 2020 reviews are marked for release on Nov 24, 2019. Here’s a thread to discuss this year’s reviews. Godspeed, everyone!
submitted by /u/donb1988
[link] [comments]
Link to paper: https://arxiv.org/abs/1901.10514
Link to my reimplementation: https://github.com/Abhishaike/HyperProtoNetReproduce
This is a PyTorch reimplementation of the NeurIPS 2019 paper ‘Hyperspherical Prototype Networks’. The paper proposes an extension to Prototype Networks in which the prototypes are placed a priori with large-margin separation and remain fixed throughout training and testing. The authors argue that this extension allows for more flexible classification, regression, and joint multi-task training of regression and classification, with higher accuracy than typical Prototype Networks.
This repo includes reproduced benchmarks for most of their datasets. The accuracy/error numbers largely match, but CIFAR-100 is quite far off (though not ImageNet-200, for some reason), so it’s possible this is an issue on my end.
I also found their use of SGD for prototype creation unusual: the way they phrase the prototype problem, it seems like a job for a constrained optimization algorithm. Alongside the SGD implementation (which is used for the included benchmarks), I added two other optimizers, one unconstrained (BFGS) and one constrained (SLSQP). These didn’t seem to change the results much.
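For context, the SGD prototype-placement step can be sketched in a few lines. This is a minimal PyTorch sketch with my own variable names (not the paper's or the repo's actual code): spread class prototypes on the unit hypersphere by minimizing each prototype's largest cosine similarity to any other, projecting back onto the sphere after every step.

```python
import torch

def place_prototypes(num_classes, dim, steps=500, lr=0.1):
    """Spread `num_classes` prototypes on the unit hypersphere by
    minimizing each prototype's largest cosine similarity to any
    other prototype; the result is frozen before network training."""
    protos = torch.randn(num_classes, dim)
    protos = (protos / protos.norm(dim=1, keepdim=True)).requires_grad_(True)
    opt = torch.optim.SGD([protos], lr=lr, momentum=0.9)
    for _ in range(steps):
        opt.zero_grad()
        # Pairwise cosine similarities; subtract 2 on the diagonal
        # so a prototype never counts as its own nearest neighbor.
        sims = protos @ protos.t() - 2.0 * torch.eye(num_classes)
        loss = sims.max(dim=1).values.mean()  # each prototype's closest neighbor
        loss.backward()
        opt.step()
        with torch.no_grad():  # project back onto the unit sphere
            protos /= protos.norm(dim=1, keepdim=True)
    return protos.detach()
```

A constrained optimizer (e.g. SLSQP with unit-norm constraints) replaces the explicit re-projection step; in my experiments that made little difference to the final separation.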
This is my first reimplementation of a paper, so any critiques would be great!
submitted by /u/ACTBRUH
[link] [comments]
What kind of work do you guys do?
submitted by /u/Ctown_struggles00
[link] [comments]
Pretty much what it says in the title, but to elaborate on the two questions:
Why isn’t Chainer more widely used? It seems to have pioneered some nice ideas like model subclassing, and it has many interesting “sub-libraries” like ChainerCV and ChainerRL, but I think it is barely used outside Japan (and I’m not sure it’s used that much there either).
And then in the same vein, why is it not more discussed when talking about Deep Learning frameworks? We see a lot of comparison between Pytorch and Tensorflow, then maybe some MxNet and new players (Jax, Halide) and non-python frameworks (mostly Julia’s), but Chainer almost seems to be ignored in most of these discussions.
(Also feel welcome to comment on pretty much the same questions but regarding Cupy vs Numba or similar)
submitted by /u/TheAlgorithmist99
[link] [comments]
I’d like to become more proficient in Python and ML. Have you found a method to train yourself?
submitted by /u/Knackmanic
[link] [comments]
I’m trying to predict the ranking of figure skaters in the annual world championship from their scores in earlier competition events in the season. The obvious method is to average the scores for each skater across past events and rank them by those averages. However, since no two events are the same, the goal of my project is to separate the skater effect (the intrinsic ability of each skater) from the event effect (how an event influences a skater’s score).
I’ve previously posted on Reddit my attempts to do this using simple linear models, which you can read about in part 1 of my project on Medium. These models output a latent score for each skater that we can use to rank them.
However, another approach to learning the skaters’ latent scores is to factorize the event-skater matrix of raw scores for the season into a skater-specific matrix and an event-specific matrix whose product approximates the raw scores. This is exactly the matrix factorization used in recommender systems, but with user = skater, item = event, and rating = raw score.
As a result, I used a variant of the famous FunkSVD algorithm to learn the skaters’ latent scores. In part 2 of my project, I found just a single latent score for each skater and ranked skaters by those scores. Next, in part 3, I learned multiple latent factors per skater using the same FunkSVD method. Since I implemented it from scratch, I tried several implementations of the algorithm — a naive one using for loops, one using numpy broadcasting, and one using matrix multiplication — and benchmarked them in both time and space complexity.
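The core FunkSVD update on the observed entries is quite compact. Here is a minimal numpy sketch with hypothetical names (not the project's actual code), assuming the scores are roughly on a unit scale — real skating scores would need centering/scaling or a much smaller learning rate:

```python
import numpy as np

def funk_svd(scores, n_factors=2, lr=0.05, reg=0.02, epochs=500):
    """SGD matrix factorization on the observed entries of an
    event-by-skater score matrix (NaN marks a skater who skipped
    an event). Returns event factors E and skater factors S such
    that E @ S.T approximates the observed scores."""
    n_events, n_skaters = scores.shape
    rng = np.random.default_rng(0)
    E = rng.normal(0, 0.1, (n_events, n_factors))
    S = rng.normal(0, 0.1, (n_skaters, n_factors))
    observed = [(i, j) for i in range(n_events)
                for j in range(n_skaters) if not np.isnan(scores[i, j])]
    for _ in range(epochs):
        for i, j in observed:
            err = scores[i, j] - E[i] @ S[j]          # residual for one cell
            E[i] += lr * (err * S[j] - reg * E[i])    # gradient step + L2 shrinkage
            S[j] += lr * (err * E[i] - reg * S[j])
    return E, S
```

The broadcasting and matrix-multiplication variants compute the same updates batched over all observed cells at once, trading the per-cell loop for memory.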
However, one major problem with multiple factors is that it’s hard to know which factor to rank skaters by. Thankfully, the ranking metric I use in the project (Kendall’s tau) allows me to build a simple logistic regression model that combines these scores to rank the skaters: the pairwise differences in score in each factor serve as predictors, and the world championship ranking itself as the response. I later learned that this belongs to the family of pairwise learning-to-rank methods often encountered in information retrieval, and you can read my implementation in part 4.
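The pairwise setup can be sketched as follows — a minimal numpy version with hypothetical names (not the project's code), using plain gradient-descent logistic regression so it stays self-contained:

```python
import numpy as np

def fit_pairwise_ranker(latent, true_rank, lr=0.1, epochs=2000):
    """Pairwise learning-to-rank: for every ordered skater pair
    (a, b), the predictor is the difference of their latent factor
    vectors and the label is whether a finished above b in the
    final ranking. Logistic regression on these pairs yields a
    weight vector w; ranking skaters by latent @ w combines the
    factors into a single score."""
    n = len(latent)
    X, y = [], []
    for a in range(n):
        for b in range(n):
            if a != b:
                X.append(latent[a] - latent[b])
                y.append(1.0 if true_rank[a] < true_rank[b] else 0.0)
    X, y = np.array(X), np.array(y)
    w = np.zeros(latent.shape[1])
    for _ in range(epochs):                  # plain gradient descent, no intercept
        p = 1.0 / (1.0 + np.exp(-X @ w))     # predicted P(a beats b)
        w -= lr * X.T @ (p - y) / len(y)
    return w
```

Because only score differences enter the model, no intercept is needed: swapping a pair flips both the predictor's sign and the label.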
However, the results at the end of this part were not very encouraging, likely due to the way I used FunkSVD to train the latent factors. Therefore, in part 5, I modified my FunkSVD implementation to train the factors in sequence instead of all at once. I only discovered afterward that Simon Funk also originally trained his factors in sequence, so I should have read his work more carefully at the start!
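The sequential variant fits one rank-1 factor at a time on the residuals left by the factors already trained, rather than updating all factors jointly. A rough numpy sketch (again hypothetical names, not the project's code):

```python
import numpy as np

def sequential_factors(scores, n_factors=2, lr=0.05, epochs=500):
    """Funk-style sequential training: fit one rank-1 factor to
    convergence on the current residuals, subtract its
    contribution, then move on to the next factor."""
    n_events, n_skaters = scores.shape
    mask = ~np.isnan(scores)
    residual = np.where(mask, scores, 0.0)
    E = np.zeros((n_events, n_factors))
    S = np.zeros((n_skaters, n_factors))
    for f in range(n_factors):
        e = np.full(n_events, 0.1)           # small positive init for this factor
        s = np.full(n_skaters, 0.1)
        for _ in range(epochs):
            for i, j in zip(*np.nonzero(mask)):
                err = residual[i, j] - e[i] * s[j]
                e[i] += lr * err * s[j]
                s[j] += lr * err * e[i]
        E[:, f], S[:, f] = e, s
        residual[mask] -= np.outer(e, s)[mask]  # leave only what this factor missed
    return E, S
```

Each later factor then explains only what the earlier ones could not, which tends to make the first factor a much cleaner "overall ability" score to rank by.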
You can see all the code I used for my project in the Github repo. I’m more than happy to receive any questions or feedback from you guys on my project!
submitted by /u/seismatica
[link] [comments]
TSNE is a very popular data visualization algorithm used alongside PCA and UMAP. Sklearn’s TSNE is very effective for small datasets, but on the 60,000-image MNIST Digits dataset, expect to wait 1 hour. With RAPIDS cuML, TSNE on MNIST runs in 3 seconds! On 200,000 rows, Sklearn takes a whopping 3 hours, whilst RAPIDS takes 5 seconds — 2,000x faster!
Figure 1. cuML TSNE on MNIST Fashion takes 3 seconds; Scikit-Learn takes 1 hour.
Check out my blog showcasing how cuML achieves this massive performance boost, and how NVIDIA GPUs can help scientists and engineers save their precious time: https://medium.com/rapids-ai/tsne-with-gpus-hours-to-seconds-9d9c17c941db
Figure 2. TSNE run on the 60,000-image Fashion MNIST dataset (3 seconds).
Give cuML a try! You might know me as the author of HyperLearn, and I can say cuML is the gold standard package for machine learning on GPUs: https://github.com/rapidsai/cuml. Linear Regression, UMAP, K-Means, DBSCAN, etc. are all sped up on the GPU! If you have any questions, feel free to ask!
Table 1. cuML’s TSNE time running on an NVIDIA DGX-1 with 1 V100 GPU.
Finally, a big drawback of current GPU implementations is their memory consumption. cuML TSNE uses 30% less GPU memory, and a future release will shave off another 33% for a total 50% memory reduction. We will also support PCA initialization.
submitted by /u/danielhanchen
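cuML aims to be a near drop-in replacement for Scikit-Learn's estimator API. A minimal sketch using the CPU version so it runs anywhere — on a RAPIDS setup (which assumes an NVIDIA GPU and the `cuml` package), only the import changes:

```python
import numpy as np
from sklearn.manifold import TSNE  # on RAPIDS: from cuml.manifold import TSNE

# Small synthetic stand-in for MNIST-style data: 100 samples, 20 features.
X = np.random.RandomState(0).normal(size=(100, 20))

# Same constructor and fit_transform shape in both libraries.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(embedding.shape)  # (100, 2)
```

On real MNIST-sized data, this fit_transform call is where the hour-vs-seconds gap shows up.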
Deep learning papers often have very good diagrams of their architectures. Does anyone know of tools that can be used to generate these sorts of diagrams? I’m not looking for automatically generated diagrams.
So my question is this:
What kind of software do people use to make nice-looking visualizations of their network architecture? A really nice example is the PointNet architecture diagram. Does anyone know what was, or could have been, used to generate it?
submitted by /u/ssd123456789
[link] [comments]
I’m reviewing for ICLR myself, so I know that reading the revised papers and carefully reading all the lengthy rebuttals feels like a terrible time-sink. But to everyone else who’s also reviewing: please remember that most authors have spent enormous time and effort on their submissions.
I’ve noticed that many reviews have already been updated after the rebuttal period, but it seems that most if not all miss key points addressed in the rebuttal or in the revised paper. There’s an option to compare revisions which highlights the changes — please use this feature, as some authors address points in the revision but don’t mention it explicitly in the rebuttal (this actually happened for every paper I’m reviewing). I submitted a paper myself and my reviews were shorter than last year’s, and I also have the feeling that the reviewers who updated their reviews didn’t carefully read my rebuttal.
I also know that many reviewers this year are reviewing for the first time, but please do make an effort to spend some time going over rebuttals and revisions. You’re now part of the ML academic community — try to make it better. We need it, especially now that many of the highest-rated papers have extremely short reviews with low confidence scores, including reviews as short as 20 words.
TLDR: we all know there are not enough reviewers and way too many submissions. While reviewing for free can be frustrating, the community depends on us, and the job includes being thoughtful and reading rebuttals/revisions carefully.
submitted by /u/watercannon123
[link] [comments]