[D] How do you combine two models, trained on separate partitions of data from the same distribution, into a single model?
Suppose we have a single large dataset D that is partitioned into A and B, with each partition placed on a separate device.
I make two replicas, M1 and M2, of the same model M, both with the same initial weights (same seed). I put one replica on each of the two devices above and train them separately.
How do I combine (if that’s the right word) what these two models have learned into a single model, i.e. one that behaves as if it had been trained on the entire dataset D?
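
To make the setup concrete, here is a minimal sketch of what I mean, in plain Python. Everything in it is an assumption for illustration: the toy dataset, the linear model, and the helper names (`make_data`, `init_model`, `train`, `average`, `mse`) are hypothetical. The combining step is plain parameter averaging weighted by partition size, which is only one candidate and not necessarily the right answer. It behaves well on this toy linear problem, but I don't assume a single average after fully separate training carries over to deep nonlinear networks.

```python
import random

def make_data(n, seed=0):
    """Toy dataset D: y = 3x + 1 plus small noise (hypothetical example data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 3.0 * x + 1.0 + rng.gauss(0.0, 0.01)
        data.append((x, y))
    return data

def init_model(seed):
    """Model M: a linear model [w, b] initialised from a fixed seed."""
    rng = random.Random(seed)
    return [rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)]

def train(params, data, epochs=50, lr=0.1):
    """Plain per-sample SGD on squared error; returns a new parameter list."""
    w, b = params
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]

def average(models, sizes):
    """Average parameters, weighting each model by its partition size."""
    total = sum(sizes)
    n_params = len(models[0])
    return [
        sum(m[i] * s for m, s in zip(models, sizes)) / total
        for i in range(n_params)
    ]

def mse(params, data):
    """Mean squared error of the linear model on a dataset."""
    w, b = params
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

# Dataset D split into partitions A and B (one per "device").
D = make_data(200)
A, B = D[:100], D[100:]

# Two replicas of M with identical initial weights, trained separately.
M1 = train(init_model(seed=42), A)
M2 = train(init_model(seed=42), B)

# Candidate combination: size-weighted parameter averaging.
M_avg = average([M1, M2], [len(A), len(B)])
print("M1:", M1, "M2:", M2, "averaged:", M_avg)
print("MSE of averaged model on all of D:", mse(M_avg, D))
```

As far as I understand, federated averaging (FedAvg) uses the same averaging step but repeats it over many rounds, re-synchronising the replicas after each one, instead of averaging once at the very end.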