[D] Adapting Neural Network Architectures
Hi everyone. I have an interesting question: how would you modify a high-performance neural net to an entirely new architecture.
For example, Tesla probably uses some sort of neural net based on convolutions (at least, I think they do; correct me if I’m wrong). On the Tesla autonomy day, Andrej Karpathy mentioned something along the lines of throwing away data after training on it (a system that Elon Musk referred to as Dojo, but the details were not provided).
Suppose that in the future, we realize that some new kind of architecture (transformers, for instance) perform significantly better than convolution-based models. How would Tesla (or anyone that’s using convents, for that matter) adapt their self-driving system to the new technology?
This shouldn’t be a problem if you keep your training data, but what would companies like Tesla, who have training data on a scale that’s infeasible to store, do? Is there any existing technique to “transfer” weights across completely different architectures?