
[P] Tensorflow implementation of WaveGlow with VQVAE

code and samples: https://github.com/jaywalnut310/waveglow-vqvae

Hi, I'm a newbie here.

Anyway, I am currently working on combining VQVAE and WaveGlow.

WaveGlow is a great model for synthesizing speech in parallel.
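WaveGlow gets that parallelism from stacked invertible affine coupling layers, so synthesis is just the exact inverse applied to all noise samples at once. A minimal numpy sketch of one coupling step (illustrative only, not the repo's code; in WaveGlow, `log_s` and `t` are predicted by a WaveNet-like network from the other half of the signal and the mel-spectrogram, here they are plain arrays):

```python
import numpy as np

def coupling_forward(x_a, x_b, log_s, t):
    """Affine coupling: x_a passes through unchanged, x_b is
    scaled and shifted elementwise by parameters (log_s, t)."""
    return x_a, x_b * np.exp(log_s) + t

def coupling_inverse(y_a, y_b, log_s, t):
    """Exact inverse of the forward step; every sample is
    recovered independently, so no autoregressive loop is needed."""
    return y_a, (y_b - t) * np.exp(-log_s)
```

Because the inverse is elementwise, a whole waveform can be generated in one pass from noise, which is what makes WaveGlow fast at synthesis time.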

VQVAE is known to be good at disentangling speaker identity and linguistic features from raw audio.
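For readers unfamiliar with VQVAE, its core operation is snapping each encoder output to the nearest vector in a learned codebook; a minimal numpy sketch (function and variable names are mine, not from the repo):

```python
import numpy as np

def vq_quantize(z, codebook):
    """Nearest-neighbour vector quantization as in VQ-VAE.
    z: (batch, dim) encoder outputs; codebook: (K, dim) embeddings.
    Returns the quantized vectors and their codebook indices."""
    # squared distance from each z to every codebook entry -> (batch, K)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)  # hard assignment
    return codebook[indices], indices
```

Since `argmin` is not differentiable, training uses a straight-through estimator (in TensorFlow, via `tf.stop_gradient`) to copy gradients from the quantized vectors back to the encoder.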

As I want to build an efficient multi-speaker voice synthesizer, I have been trying to combine those two models.

There is still a lot of work remaining, though.

So far, what I have found from my implementation is:

– For a single speaker, it works quite well.

– For multiple speakers, it doesn’t seem to disentangle speaker identity from linguistic features.

I am trying to solve this issue now, so if you have any ideas, please let me know.

Additionally, I slightly modified the pure VQVAE method with a Soft-EM-like gradient descent method.

For now, it seems to work quite well, avoiding hyperparameter tuning and index collapse.
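The post doesn't spell out the Soft-EM variant, but a common Soft-EM-style codebook update replaces the hard nearest-neighbour assignment with softmax responsibilities (E-step) and moves each code toward its responsibility-weighted mean (M-step). A hypothetical numpy sketch, with `temperature` and `lr` as assumed knobs, not parameters from the repo:

```python
import numpy as np

def soft_em_update(z, codebook, temperature=1.0, lr=0.1):
    """One Soft-EM-style codebook update (illustrative sketch).
    z: (batch, dim) encoder outputs; codebook: (K, dim)."""
    # E-step: soft responsibilities from negative squared distances
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (B, K)
    logits = -d / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerically stable softmax
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)       # (B, K)
    # M-step: move each code toward its responsibility-weighted mean
    weights = resp.sum(axis=0)                    # (K,)
    target = (resp.T @ z) / np.maximum(weights[:, None], 1e-8)
    return codebook + lr * (target - codebook)
```

Because every code receives a nonzero share of each batch, soft assignments keep rarely-used codes updating, which is one plausible reason such updates can mitigate index collapse (dead codes).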

For more information, please see my repository, and if you're interested, please leave critical comments!

submitted by /u/jaywalnut-310