The model is Tacotron 2, based on this repo.
I got it working with PyTorch DDP, but the gap between single-GPU and distributed training seems too large to me: the single-GPU loss is much better and more stable, and 8 GPUs give only about a 2x speedup at 8x the cost.
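For context, here is roughly how I set things up (a minimal sketch assuming a torchrun launch; `setup_ddp` is just an illustrative helper, not code from the repo):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def setup_ddp(model, dataset, per_gpu_batch_size):
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(model.cuda(local_rank), device_ids=[local_rank])

    # Each rank sees a distinct shard of the data, so the effective global
    # batch is per_gpu_batch_size * world_size.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=per_gpu_batch_size, sampler=sampler)
    return model, loader, sampler
```

The training loop then calls `sampler.set_epoch(epoch)` at the start of each epoch so the shuffling differs between epochs.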
Am I missing something obvious?
Maybe it's because of batch norm? I tried SyncBatchNorm, but it doesn't really make a difference.
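For reference, the conversion I tried (a sketch using the standard PyTorch call, applied before the DDP wrap; `model` is the Tacotron 2 module from above):

```python
import os
import torch

# Swap every BatchNorm layer for SyncBatchNorm so batch statistics are
# computed over the global batch across all ranks instead of per-GPU.
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
model = torch.nn.parallel.DistributedDataParallel(
    model.cuda(local_rank), device_ids=[local_rank]
)
```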
submitted by /u/hadaev