[D] – Initialise a network with sub-networks from pre-trained networks
Thinking about the lottery ticket hypothesis and masking randomly initialised networks…
I think the following would be successful:
Have a database of many pretrained networks (BERT, ResNet, etc.).
Draw random subnetworks from this database.
Initialise the new network using these samples.
Train the network and sparsify it aggressively, preserving only the parts that are very useful.
Repopulate the masked areas of the network using another random sample from the network database.
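The loop above can be sketched in a few lines. This is a toy illustration, not an implementation: the "database" here is just a pair of random weight matrices standing in for real checkpoints, the sub-network sample is a random sub-block of one matrix, pruning is plain magnitude pruning, and the training step is omitted. All names (`PRETRAINED_DB`, `sample_subnetwork`, etc.) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a database of pretrained networks: each "network"
# is a single weight matrix. In practice these would be real checkpoints
# (BERT, ResNet, ...).
PRETRAINED_DB = [
    rng.standard_normal((64, 64)),
    rng.standard_normal((128, 128)),
]

def sample_subnetwork(shape, db, rng):
    """Draw a random sub-block of a random pretrained weight matrix."""
    w = db[rng.integers(len(db))]
    r = rng.integers(0, w.shape[0] - shape[0] + 1)
    c = rng.integers(0, w.shape[1] - shape[1] + 1)
    return w[r:r + shape[0], c:c + shape[1]].copy()

def magnitude_prune(w, keep_frac):
    """Zero out all but the largest-magnitude weights; return weights and mask."""
    k = int(w.size * keep_frac)
    thresh = np.sort(np.abs(w).ravel())[-k]
    mask = np.abs(w) >= thresh
    return w * mask, mask

def repopulate(w, mask, db, rng):
    """Refill the pruned (masked-out) positions from a fresh pretrained sample."""
    fresh = sample_subnetwork(w.shape, db, rng)
    return np.where(mask, w, fresh)

# One round of the proposed loop (the training step is elided):
w = sample_subnetwork((32, 32), PRETRAINED_DB, rng)  # initialise from the DB
# ... train w here ...
w, mask = magnitude_prune(w, keep_frac=0.2)          # aggressive sparsification
w = repopulate(w, mask, PRETRAINED_DB, rng)          # refill masked areas
```

In a real setting the sampling step would need to respect layer shapes and architectures (e.g. drawing whole attention heads or conv filters rather than arbitrary sub-blocks), and the prune/repopulate cycle would repeat over many training rounds.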
The underlying assumption is that when we train networks, we are finding the sub-networks within the random initialisation that are already useful and tuning them. By sampling from the neural network database, we sample from the space of networks that have already been found useful for one task or another, and can therefore initialise our network in a more intelligent way, piggybacking on the large-scale compute poured into existing high-quality networks.