[P] Seeing music using deepsing: Creating machine-generated visual stories of songs
Can machines dream while listening to music? Is it possible to turn music into images in a meaningful way? deepsing was born to materialize our idea of translating audio to images, inspired by the Holophoner from Futurama. In this way, deepsing is able to autonomously generate visual stories that convey the emotions expressed in songs. Such music-to-image translation poses unique challenges, mainly due to the unstable mapping between the different modalities involved in the process. To overcome these limitations, deepsing employs a trainable cross-modal translation method, leading to a deep learning approach for generating sentiment-aware visual stories.
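To make the idea concrete: one way to think about sentiment-driven image selection is to map each song segment's sentiment (e.g., valence and arousal) to candidate image classes. The sketch below is purely illustrative, not our actual implementation; the class names and the quadrant-based mapping are hypothetical placeholders.

```python
# Hypothetical sketch of the music-to-image idea: map a song segment's
# sentiment (valence/arousal, each in [-1, 1]) to candidate image classes.
# The quadrant-to-class mapping is illustrative only, not deepsing's real one.

def sentiment_to_classes(valence: float, arousal: float) -> list:
    """Pick image classes matching a valence/arousal quadrant."""
    if valence >= 0 and arousal >= 0:
        return ["fireworks", "carnival"]   # happy / excited
    if valence >= 0 and arousal < 0:
        return ["beach", "meadow"]         # calm / content
    if valence < 0 and arousal >= 0:
        return ["storm", "volcano"]        # tense / angry
    return ["fog", "ruins"]                # sad / gloomy

def visual_story(segments: list) -> list:
    """One image class per (valence, arousal) segment: a toy 'visual story'."""
    return [sentiment_to_classes(v, a)[0] for v, a in segments]
```

In the actual method, the audio-to-sentiment step and the image generation are learned; this toy version only shows how a per-segment sentiment signal can drive a sequence of images.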
We have implemented a front-end for our method at https://www.deepsing.com. You can find an example of a purely machine-generated visual story at https://deepsing.com/engine/9C0xGB73Uuc/5dfbcd1ec9e5f7311d8a9fcf. Note that the version available at https://www.deepsing.com currently lacks many essential features, but it demonstrates the basic concept of our idea! Also, note that song lyrics are NOT used in this process, since the proposed method currently works SOLELY on the sentiment induced by the audio!
Furthermore, you can find more information in our preprint at https://arxiv.org/abs/1912.05654, and we have released the code of our method at https://github.com/deepsing-ai/deepsing. Feel free to hack with us and share your opinions!