[D] Relation between the learned parameters of two neural networks trained on the same dataset
I was wondering whether there is any work that studies how the learned weights of two neural nets relate to each other.
For example, suppose we have a simple regression task and train an MLP with one hidden layer of 20 neurons. If we then train another MLP with 15 neurons in the hidden layer on the same data, how would the weight matrices of the two networks be related?
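One complication for comparing weight matrices directly, as far as I can tell, is that hidden units can be reordered without changing the function the network computes, so any relation between two weight matrices can only hold up to a permutation of the hidden units. A minimal NumPy sketch of this (the input dimension of 3, the tanh activation, and the random weights are my own illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 3, 20                       # input dimension and hidden width (illustrative values)

# Random weights stand in for a trained network; the argument does not depend on training.
W1 = rng.normal(size=(d, h))       # input-to-hidden weights
b1 = rng.normal(size=h)            # hidden biases
W2 = rng.normal(size=(h, 1))       # hidden-to-output weights
b2 = rng.normal(size=1)            # output bias

def mlp(X, W1, b1, W2, b2):
    """One-hidden-layer MLP with tanh activation and a scalar output."""
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Reorder the hidden units: permute columns of W1, entries of b1, and rows of W2 together.
perm = rng.permutation(h)
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]

X = rng.uniform(-1, 1, size=(100, d))
same_function = np.allclose(mlp(X, W1, b1, W2, b2), mlp(X, W1p, b1p, W2p, b2))
weights_differ = not np.allclose(W1, W1p)

print("same outputs:", same_function, "| different weight matrix:", weights_differ)
```

So an entry-by-entry comparison of two weight matrices probably needs some matching of hidden units first, or a comparison of activations instead of raw weights.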
I found some related work in the neural network compression literature that starts with the bigger model and uses matrix pruning with factorization and/or decomposition to reach a smaller model. But I’m not sure whether the resulting parameters will be close to the weights that a neural network (with the same architecture as the smaller model) would learn if trained from scratch. I mean, the fact that pruning methods can reach good accuracy doesn’t necessarily mean that they capture the true relation between the bigger model and the smaller one. What do you think?
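To make the question concrete, here is a rough sketch of the experiment I have in mind, in plain NumPy. Everything specific is an assumption for illustration: the toy target function, tanh activations, full-batch gradient descent, pruning by smallest outgoing-weight magnitude (a simple stand-in, not a method taken from the compression literature), and linear CKA on hidden activations as the similarity measure instead of a raw weight comparison.

```python
import numpy as np

def train_mlp(X, y, width, seed, steps=2000, lr=0.05):
    """Train a one-hidden-layer tanh MLP on 0.5 * MSE with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, width))
    b1 = np.zeros(width)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, 1))
    b2 = np.zeros(1)
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)               # hidden activations, shape (n, width)
        err = H @ W2 + b2 - y                  # residuals, shape (n, 1)
        dH = (err @ W2.T) * (1.0 - H ** 2)     # backprop through tanh (uses W2 before its update)
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2

def hidden_acts(X, params):
    W1, b1, _, _ = params
    return np.tanh(X @ W1 + b1)

def predict(X, params):
    _, _, W2, b2 = params
    return hidden_acts(X, params) @ W2 + b2

def mse(X, y, params):
    return float(np.mean((predict(X, params) - y) ** 2))

def prune_units(params, keep):
    """Keep the `keep` hidden units with the largest |outgoing weight| (assumed criterion)."""
    W1, b1, W2, b2 = params
    idx = np.sort(np.argsort(np.abs(W2[:, 0]))[-keep:])
    return W1[:, idx], b1[idx], W2[idx, :], b2

def linear_cka(A, B):
    """Linear CKA between activation matrices of shape (n, p) and (n, q)."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    num = np.linalg.norm(A.T @ B) ** 2         # squared Frobenius norm of the cross-covariance
    return float(num / (np.linalg.norm(A.T @ A) * np.linalg.norm(B.T @ B)))

# Toy regression data (hypothetical target, chosen only for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(256, 3))
y = np.sin(3 * X[:, :1]) + X[:, 1:2] * X[:, 2:3]

big = train_mlp(X, y, width=20, seed=1)        # the 20-neuron network
scratch = train_mlp(X, y, width=15, seed=2)    # the 15-neuron network trained from scratch
pruned = prune_units(big, keep=15)             # 20 -> 15 neurons by pruning, no fine-tuning

cka_pruned_vs_scratch = linear_cka(hidden_acts(X, pruned), hidden_acts(X, scratch))
print("MSE big / pruned / scratch:", mse(X, y, big), mse(X, y, pruned), mse(X, y, scratch))
print("linear CKA, pruned vs. scratch hidden layer:", cka_pruned_vs_scratch)
```

If the two 15-neuron networks reach similar error but the CKA between their hidden layers is low, that would suggest the pruned model and the scratch-trained model solve the task in different ways, which is exactly the case I’m unsure about.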