Let’s say we have a simple “autoencoding transformer” architecture: an encoder maps the input sequence to a latent representation Z, and a decoder reconstructs the input from Z.
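To make the setup concrete, here is a minimal PyTorch sketch of one plausible version of such an architecture. Every size, layer count, and the mean-pool bottleneck here are illustrative assumptions, not a claim about the original model:

```python
import torch
import torch.nn as nn

class TinyAutoencodingTransformer(nn.Module):
    """Hypothetical sketch: a transformer encoder compresses the token
    sequence into a single latent vector Z; a decoder stack reconstructs
    the tokens from Z. All sizes and layer counts are illustrative."""

    def __init__(self, vocab_size=1000, d_model=128, z_dim=32, max_len=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))
        enc = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.to_z = nn.Linear(d_model, z_dim)    # bottleneck: pooled state -> Z
        self.from_z = nn.Linear(z_dim, d_model)  # broadcast Z back over the sequence
        dec = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec, num_layers=2)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):  # tokens: (batch, seq) int64
        h = self.embed(tokens) + self.pos[:, : tokens.size(1)]
        z = self.to_z(self.encoder(h).mean(dim=1))             # Z: (batch, z_dim)
        h_dec = self.from_z(z).unsqueeze(1) + self.pos[:, : tokens.size(1)]
        logits = self.out(self.decoder(h_dec))                 # (batch, seq, vocab)
        return logits, z
```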
We can train the model using either of two objectives:
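For illustration only, suppose the two objectives are (a) full-sequence reconstruction and (b) masked-token-only reconstruction; these are assumed stand-ins for the two losses being compared (the actual pair could differ), written against the hypothetical model above:

```python
import torch
import torch.nn.functional as F

def full_reconstruction_loss(model, tokens):
    # Objective A (assumed): cross-entropy over every position of the input.
    logits, _ = model(tokens)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), tokens.reshape(-1))

def masked_reconstruction_loss(model, tokens, mask_id=0, p=0.15):
    # Objective B (assumed): corrupt a random subset of tokens and score
    # only the corrupted positions, an MLM/denoising-style variant.
    mask = torch.rand(tokens.shape, device=tokens.device) < p
    logits, _ = model(tokens.masked_fill(mask, mask_id))
    return F.cross_entropy(logits[mask], tokens[mask])
```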
Now we ask about the properties of Z, the latent representation of the data, after the model is trained. Will Z differ between the two objectives? How will it differ? Will it capture different information? Which loss will preserve more information in Z?
Does this have an obvious interpretation? Any intuitions?
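One way to turn “which loss preserves more information in Z” into something measurable: freeze the trained encoder and fit a small probe that tries to recover properties of the input from Z alone; whichever objective yields lower probe error has, in that operational sense, kept more of that information. A sketch, assuming the hypothetical model above and a bag-of-words probe target (one arbitrary choice among many):

```python
import torch
import torch.nn as nn

def fit_bag_of_words_probe(model, data_loader, vocab_size, z_dim=32):
    """Hypothetical probe: from Z alone, predict which vocabulary items
    occur in the input (a multi-hot target). Lower probe loss under one
    training objective suggests that objective kept more of this
    information in Z."""
    probe = nn.Linear(z_dim, vocab_size)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()
    for tokens in data_loader:  # assumed to yield (batch, seq) int64 tensors
        with torch.no_grad():   # encoder stays frozen; only the probe trains
            _, z = model(tokens)
        target = torch.zeros(tokens.size(0), vocab_size)
        target.scatter_(1, tokens, 1.0)  # multi-hot: which tokens appear
        loss = bce(probe(z), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return probe  # compare held-out probe loss across the two objectives
```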
submitted by /u/maskedlanguagemodel