[D] Why do Variational Autoencoders encode each datapoint to an individual normal distribution over z, rather than forcing all encodings Z to be normally distributed?

As in the title. Variational autoencoders encode each data sample x_i to a distribution over z, and then minimize the KL divergence between q(z_i | x_i) and p(z), where p(z) is N(0, I). In cases where the encoder does a good job of minimizing the KL loss, the reconstruction is often poor, and in cases where the reconstruction is good, the encoder may not do a good job of mapping onto p(z).
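For reference, when the encoder outputs a diagonal Gaussian, the per-datapoint KL term described above has a well-known closed form. Here is a minimal NumPy sketch, assuming the encoder returns a mean `mu` and a log-variance `log_var` for each sample; the function and argument names are illustrative and not taken from any particular library.

```python
import numpy as np

def kl_per_sample(mu, log_var):
    """KL( N(mu_i, diag(exp(log_var_i))) || N(0, I) ) for each row i.

    mu, log_var: arrays of shape (batch, latent_dim).
    Returns an array of shape (batch,).
    """
    # Closed form for diagonal Gaussians against a standard normal prior:
    # 0.5 * sum_d (mu^2 + sigma^2 - 1 - log sigma^2)
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=1)

# Hypothetical encoder outputs for two samples in a 2-D latent space.
mu = np.array([[0.0, 0.0], [1.0, 1.0]])
log_var = np.zeros((2, 2))
kl = kl_per_sample(mu, log_var)
```

The first row already matches the prior exactly, so its KL is zero; the second row is penalized only for its shifted mean.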

Is there some reason why we can’t just feed in all datapoints x, which gives us a distribution over all encodings z, and then force those encodings to be normally distributed (i.e. find the mean and stdev over z, and penalize their distance from N(0, I))? This way, you don’t even need to use the reparameterization trick. If you wanted to, you could still have each point be a distribution; you would just need to take each individual variance into account as well as the means.
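One possible reading of this proposal is moment matching over a batch: fit a single diagonal Gaussian to all the encodings and penalize its KL divergence from N(0, I). The sketch below is an assumption about what such a penalty could look like, not the poster's actual code; `aggregate_moment_penalty` and `eps` are hypothetical names. When per-point variances are supplied, the aggregate variance is computed with the law of total variance (mean of the variances plus variance of the means).

```python
import numpy as np

def aggregate_moment_penalty(mu, log_var=None, eps=1e-8):
    """KL( N(m, diag(v)) || N(0, I) ), where m and v are the per-dimension
    mean and variance of the encodings pooled over the whole batch.

    mu: (batch, latent_dim) encodings, or per-point means.
    log_var: optional (batch, latent_dim) per-point log-variances.
    """
    m = mu.mean(axis=0)   # batch mean of the encodings
    v = mu.var(axis=0)    # variance of the means across the batch
    if log_var is not None:
        # Law of total variance: add the average per-point variance.
        v = v + np.exp(log_var).mean(axis=0)
    v = v + eps           # guard against log(0) for collapsed dimensions
    return 0.5 * np.sum(m**2 + v - 1.0 - np.log(v))

# Encodings that are already roughly standard normal give a penalty near zero.
rng = np.random.default_rng(0)
z = rng.normal(size=(10000, 3))
penalty_normal = aggregate_moment_penalty(z)

# Two point-distributions with means +/-1 in the first dimension, unit variances.
mu = np.array([[1.0, 0.0], [-1.0, 0.0]])
penalty_mixture = aggregate_moment_penalty(mu, log_var=np.zeros((2, 2)))
```

Note that this penalty only constrains the first two moments of each latent dimension. It does not look at correlations between dimensions or at the shape of the pooled distribution, so a non-Gaussian set of encodings with zero mean and unit variance would also score near zero.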

I’ve tested this out and it works without any issue, so is there some theoretical reason why it’s not done this way? Is it standard practice in variational methods for each datapoint x_i to have its own distribution, and if so, why?

submitted by /u/DearJudge