Collaborate on an idea for initialization of neural networks [R] [P]
Most weight initialization strategies for neural networks depend on controlling variance (and mean) of the propagating signals. However, they rely on some assumptions which do not hold always in the practice.
For example using relu immediately shifts the mean.
I am thinking it might be possible to learn the initialization of the network with an appropriate cost function. A cost function to adjust weights so the variance of the input variance smoothly transforms into output variance.
Anybody wants to collaborate on a quick project to test these, perhaps somebody who has worked on a problem like this before.
I have tried something preliminary but I need advice from an experienced NN experimenter:)