[R] Understanding the generalization of “lottery tickets” in neural networks
Sharing our new blog post summarizing some of our recent work on understanding the boundaries of the lottery ticket hypothesis. In particular, we make some progress on the following questions:
- Do winning ticket initializations contain generic inductive biases or are they overfit to the particular dataset and optimizer used to generate them?
- Is the lottery ticket phenomenon limited to supervised image classification, or is it also present in other domains like RL and NLP?
- Can we begin to explain lottery tickets theoretically?
The blog post is below:
Understanding the generalization of “lottery tickets” in neural networks
And the papers covered can be found here:
Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP
Luck Matters: Understanding Training Dynamics of Deep ReLU Networks
Student Specialization in Deep ReLU Networks With Finite Width and Input Dimension
submitted by /u/arimorcos