[D] Why are arithmetic operations of latent variables meaningful?
I’ve noticed that in a lot of latent variable models, authors will perform arithmetic operations on the latent space and show that they have meaning, e.g. ‘king – man + woman = queen’ in word2vec, attribute vectors for VAEs, and even linear interpolation between latents for VAEs.
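To be concrete about the attribute-vector trick I mean: you average the latents of examples that have an attribute, subtract the average latent of examples that don’t, and add that difference to a new code before decoding. A mechanics-only sketch with NumPy (the latents here are random stand-ins, not outputs of a real encoder; `d` and the group sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # latent dimension (arbitrary for illustration)

# Stand-ins for encoded latents; in practice these would come from
# a trained encoder applied to real data. The "+ 1.0" just simulates
# the attribute shifting the codes in some direction.
z_with_attr = rng.standard_normal((100, d)) + 1.0  # e.g. faces with glasses
z_without = rng.standard_normal((100, d))          # faces without

# Attribute vector: difference of the two group means.
attr = z_with_attr.mean(axis=0) - z_without.mean(axis=0)

# "Add the attribute" to a fresh latent before decoding it.
z = rng.standard_normal(d)
z_edited = z + attr
```

The surprising part is that decoding `z_edited` actually produces the original content plus the attribute, which is exactly the behavior I’m asking for a justification of.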
What part of training makes this happen? For concreteness, let’s look at VAEs for the time being, with the usual Gaussian prior. It would seem that linear interpolation could yield bad results in this case, since at some point the interpolation is likely to pass through a vector of small norm, which would be very unlikely to be sampled from the prior when the latent space is high-dimensional. In fact, some papers even reference this issue and use things like SLERP instead. Nevertheless, the results clearly work. Is there a theoretical justification for why these operations have meaning? Why should we even expect a properly trained VAE to exhibit these properties?
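The norm argument is easy to check numerically: samples from a standard Gaussian in dimension d concentrate on a shell of radius about sqrt(d), the linear midpoint of two independent samples shrinks the norm by roughly 1/sqrt(2), and SLERP keeps the interpolant near the original shell. A quick sketch (d = 512 chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # latent dimension (arbitrary, but "high")

# Pairs of independent samples from the standard Gaussian prior.
z1 = rng.standard_normal((1000, d))
z2 = rng.standard_normal((1000, d))

# Norms concentrate tightly around sqrt(d).
print(np.linalg.norm(z1, axis=1).mean())  # ~ sqrt(512) ≈ 22.6

# Linear midpoint is distributed N(0, 0.5 I), so its norm is
# ~ sqrt(d/2): it lands well inside the typical shell.
mid = 0.5 * (z1 + z2)
print(np.linalg.norm(mid, axis=1).mean())  # ~ sqrt(256) = 16

# SLERP interpolates along the sphere instead of the chord,
# so the midpoint stays near the original norm.
def slerp(a, b, t):
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

s = slerp(z1[0], z2[0], 0.5)
print(np.linalg.norm(s))  # stays near sqrt(512)
```

So the linear midpoint really does sit in a region the prior almost never visits, which is what makes it puzzling (to me) that linear interpolation nevertheless decodes to sensible outputs.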