[R] Audio Conversion GAN with Unpaired Data

For the past month I have been working on voice conversion using unpaired data. I naively applied image conversion algorithms to audio spectrograms and, after working through a few obstacles, got convincing, though not perfect, results.

The exact same algorithm also works for music genre conversion, and the results are pretty interesting despite a fairly shallow, low-capacity generator.

Here are some examples:

https://youtu.be/3BN577LK62Y

The model is able to translate audio signals of any length and does not use any vocoder.
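
The post does not say how arbitrary lengths are handled, so the following is only a minimal sketch of one common way to do it, not the actual method: split the spectrogram along the time axis into fixed-width chunks, convert each chunk, and concatenate the outputs. The names `convert_any_length`, `toy_generator`, and `chunk_frames` are hypothetical, and the toy generator merely stands in for a trained network. Plain Python lists are used to keep the example dependency-free.

```python
from typing import Callable, List

Frame = List[float]          # one spectrogram column (magnitude per frequency bin)
Spectrogram = List[Frame]    # time-major layout: spectrogram[t][f]


def convert_any_length(
    spec: Spectrogram,
    generator: Callable[[Spectrogram], Spectrogram],
    chunk_frames: int = 4,
) -> Spectrogram:
    """Apply a fixed-width `generator` to a spectrogram of arbitrary length.

    Assumption: the generator maps a chunk of `chunk_frames` frames to a
    chunk of the same shape. This is an illustration, not the post's model.
    """
    if chunk_frames <= 0:
        raise ValueError("chunk_frames must be positive")
    if not spec:
        return []

    n_frames = len(spec)
    n_bins = len(spec[0])

    # Pad the time axis with silent frames up to a multiple of chunk_frames.
    pad = (chunk_frames - n_frames % chunk_frames) % chunk_frames
    padded = spec + [[0.0] * n_bins for _ in range(pad)]

    # Convert each fixed-width chunk independently and concatenate the results.
    out: Spectrogram = []
    for start in range(0, len(padded), chunk_frames):
        out.extend(generator(padded[start:start + chunk_frames]))

    # Drop the frames that exist only because of padding.
    return out[:n_frames]


def toy_generator(chunk: Spectrogram) -> Spectrogram:
    """Hypothetical placeholder for a trained generator: halves every magnitude."""
    return [[0.5 * v for v in frame] for frame in chunk]


spec = [[float(t), float(t) + 1.0] for t in range(10)]  # 10 frames, 2 bins
converted = convert_any_length(spec, toy_generator, chunk_frames=4)
```

Here the 10-frame input is padded to 12 frames, converted in three chunks, and trimmed back to 10. Independent chunks can produce audible seams at the boundaries, so a real system might overlap chunks or rely on a fully convolutional generator instead.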

I cannot find papers with similar approaches, and I don’t really know what I should do with this research. I am an Engineering student and don’t understand how the academic world works, so maybe a simple article and a code release is the best idea.

Thank you for your attention!

submitted by /u/artika_labs