[P] Using Tacotron To Make Ben Shapiro Sing
100% of the vocals here were generated by my model. None of them were spoken by Ben Shapiro himself, and they do not reflect Shapiro’s views. Shapiro’s voice was created with a TTS model I trained using my implementation of the papers “Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis” (https://arxiv.org/abs/1803.09017) and “Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron” (https://arxiv.org/abs/1803.09047), using just over 2 hours of Shapiro audio (though, given how fast he talks, I suppose that’s more like 3-4 hours’ worth of speech for the average person). Amusingly, having learned Shapiro’s speech patterns, the model generates speech that is even faster than the average pace at which Eminem raps this song (only the part at 3:12 is sped up, by 1.5x).
submitted by /u/hanyuqn