[R] MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets
- Project page: https://hyperconnect.github.io/MarioNETte/
- Paper: https://arxiv.org/abs/1911.08139
- Video: https://youtu.be/Y6HE1DtdJHg
Here’s a summary of the work. We perform face reenactment in a few-shot or even one-shot setting, where only a single target face image is provided. Previous approaches to face reenactment had a hard time preserving the identity of the target, and sidestepped the problem by fine-tuning on the target or by choosing a driver that does not diverge too much from it. We tackle this “identity preservation problem” directly through several novel components.
Instead of working with a spatially agnostic representation of the driver or target, we encode the style information into a representation that preserves spatial information. This lets us retain details that are easily lost with spatially agnostic representations, such as those obtained from AdaIN layers.
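To make the contrast concrete, here is a minimal toy sketch (not the paper's implementation; shapes and names are illustrative) of what an AdaIN-style summary throws away: it reduces a full C×H×W target feature map to per-channel mean and standard deviation, so any per-position detail is gone before the generator ever sees it.

```python
# Toy contrast: AdaIN-style channel statistics vs. a full spatial feature map.
# Illustrative sketch only -- shapes and the adain() helper are assumptions.
import numpy as np

rng = np.random.default_rng(0)
style = rng.normal(size=(8, 16, 16))  # C x H x W target feature map

# AdaIN-style summary: per-channel mean/std only (spatial layout discarded).
mu = style.mean(axis=(1, 2), keepdims=True)
sigma = style.std(axis=(1, 2), keepdims=True)

def adain(content, mu, sigma):
    """Re-normalize content features with the target's channel statistics."""
    c_mu = content.mean(axis=(1, 2), keepdims=True)
    c_sigma = content.std(axis=(1, 2), keepdims=True) + 1e-5
    return sigma * (content - c_mu) / c_sigma + mu

content = rng.normal(size=(8, 16, 16))
stylized = adain(content, mu, sigma)

# The AdaIN path carries only mu.size + sigma.size = 16 numbers forward,
# while the spatially preserving path keeps all style.size = 2048 values,
# so localized detail (e.g. a mole at one position) can survive.
print(mu.size + sigma.size, "vs", style.size)
```

The point of the sketch is just the information bottleneck: whatever module consumes the statistics can never recover where in the face a detail was located.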
We also propose image attention blocks and feature alignment modules that attend to specific locations and warp feature-level information. Combining attention and flow lets us deal naturally with multiple target images, so the model handles both one-shot and few-shot settings gracefully without resorting to reductions such as sum/max/average pooling.
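A rough sketch of why attention sidesteps pooling, assuming a plain scaled dot-product formulation (the paper's actual attention block may differ): keys and values from all K target feature maps are simply concatenated, so each driver position can draw from any target, and K can vary freely at test time.

```python
# Minimal sketch of attending over K target feature maps instead of pooling.
# Shapes and the dot-product form are assumptions, not the paper's exact module.
import numpy as np

rng = np.random.default_rng(1)
C, HW, K = 8, 64, 3                  # channels, spatial positions, target images
q = rng.normal(size=(HW, C))         # driver queries, one per spatial position
kv = rng.normal(size=(K * HW, C))    # keys/values from all K targets, concatenated

scores = q @ kv.T / np.sqrt(C)       # (HW, K*HW) attention logits
w = np.exp(scores - scores.max(axis=1, keepdims=True))
w /= w.sum(axis=1, keepdims=True)    # softmax over every position of every target
out = w @ kv                         # (HW, C) attended target features

# No sum/max/average over the K axis: the attention weights decide, per driver
# position, how much each target position contributes.
```

Because the target axis is folded into the attention's key/value dimension, the same module runs unchanged for K = 1 (one-shot) or any larger K (few-shot).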
Another part of the contribution is the landmark transformer, which alleviates the identity preservation problem even further. When the driver’s landmarks differ significantly from the target’s, the reenacted face tends to take on the driver’s facial characteristics. The landmark transformer disentangles identity and expression, and can be trained in an unsupervised fashion.
Check out the video and tell us what you think. Thanks!