Hello!
I’d like to tell you about our recent paper. We’ve tackled the problem of few-shot generation of talking heads: given a few images of a person (or even a single one), train a model that can synthesize new images of that person in a new pose (viewpoint and expression).
Our model was trained on a publicly available dataset of YouTube videos (VoxCeleb2, 224p) and avoided mode collapse, even though the quality of the images there varies a lot. Hence, it generalizes well to new images of identities unseen during training (we can even run it on paintings and get reasonable results).
The key ingredients are adversarial meta-learning, adversarial fine-tuning, and adaptive instance normalization. For more details, please refer to the paper; a short description of our method and the results are in the video below.
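As a rough illustration of the adaptive instance normalization ingredient, here is a minimal sketch of the generic operation: each channel of a feature map is normalized to zero mean and unit variance, then rescaled and shifted with externally supplied per-channel values. This is not the code used for the paper. The function name `adain`, the flattened-list layout, and the toy inputs are assumptions made for illustration, and how the scale and bias values are actually produced is left to the paper.

```python
import math


def adain(channels, scales, biases, eps=1e-5):
    """Generic adaptive instance normalization for one image.

    channels: list of channels, each a flat list of activations
              (spatial dimensions flattened; a layout chosen for this sketch).
    scales, biases: one value per channel, supplied from outside the layer.
    """
    out = []
    for values, scale, bias in zip(channels, scales, biases):
        # Per-channel statistics over the spatial positions of this one image.
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)  # eps guards against constant channels
        # Normalize, then apply the externally supplied scale and bias.
        out.append([scale * (v - mean) / std + bias for v in values])
    return out


# Toy example: two channels with four spatial positions each.
features = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 10.0]]
stylized = adain(features, scales=[2.0, 1.0], biases=[0.5, -1.0])
```

After the call, each output channel has a mean equal to its bias and a standard deviation close to its scale (a constant channel collapses to the bias alone), which is what lets the supplied values control the statistics of the features.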
ArXiv: https://arxiv.org/abs/1905.08233
Video: https://www.youtube.com/watch?v=p1b5aiTrGzY
submitted by /u/ezakharov