
[D] Multi-style disentanglement and Unsupervised aesthetics prediction of music by predicting future and analysing past

The problem with aesthetics prediction is that it is learned on datasets rated by a limited set of users, which may not reflect the diversity of aesthetic perception across different people, so the resulting models generalize poorly.

I’ve been thinking about using content & style disentanglement to learn several styles (and the relations between them), then adding light supervision from a human who selects the images they personally find most beautiful, so the algorithm can search for the most similar styles.

However, to expose the model to as many styles as possible, it should have intrinsic motivation (curiosity) to explore the styles it struggles to disentangle over those it has already disentangled (almost) successfully, then learn to combine them, followed by a model that disentangles several styles from a single content.
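As a sketch of the curiosity idea, one simple scheme is to sample training styles in proportion to their current disentanglement loss. The style names and loss values below are made up for illustration, not outputs of a real model:

```python
import random

def curious_style_sampler(style_losses, rng=random):
    """Sample a style with probability proportional to its current
    disentanglement loss, so styles the model still struggles to
    separate from content get explored more often."""
    styles = list(style_losses)
    weights = [style_losses[s] for s in styles]
    return rng.choices(styles, weights=weights, k=1)[0]

# Hypothetical per-style losses: higher = harder to disentangle so far.
style_losses = {"baroque": 0.9, "jazz": 0.4, "ambient": 0.1}
next_style = curious_style_sampler(style_losses)
```

As the model improves on a style, its loss (and thus its sampling weight) drops, shifting exploration toward the styles that remain entangled.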

The next topic is music creation: today’s best-performing models apparently learn on several musical genres, then synthesize a new sample by starting from scratch and predicting the next note until the piece reaches several minutes.
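The “start from scratch and predict the next note” loop can be sketched like this; the transition model here is a toy stand-in for a trained network:

```python
import random

def generate_sequence(next_note_model, length=16, seed_note=60):
    """Toy autoregressive loop: begin with a seed pitch and repeatedly
    sample the next note conditioned on the previous one."""
    notes = [seed_note]
    while len(notes) < length:
        candidates, weights = next_note_model(notes[-1])
        notes.append(random.choices(candidates, weights=weights, k=1)[0])
    return notes

def toy_model(prev_note):
    """Hypothetical transition model: prefer small steps around the
    previous MIDI pitch."""
    candidates = [prev_note - 2, prev_note - 1, prev_note,
                  prev_note + 1, prev_note + 2]
    weights = [1, 2, 3, 2, 1]
    return candidates, weights

melody = generate_sequence(toy_model, length=16)
```

A real system would replace `toy_model` with a network conditioned on the whole history, but the sampling loop keeps the same shape.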

To make the music piece more tense, I believe these steps would need to be in place:

1. Learn one model to predict the future arrangement of a song at any given time, then make the generator minimize that model’s certainty.

2. Learn another model to analyze the past of the song, and have the generator maximize this model’s recognition rate (certainty).

(These two models may share their knowledge, since both work on the same musical pieces.)

So both models are trained to recognize genres and learn their patterns, except that the first one focuses on the future (which the model has not yet heard) and the second on the past (which it has already heard).

The aesthetics score would then be produced from the two models’ rewards when predicting and analyzing that specific song (as specified in the two steps above).
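Putting the two steps together, a candidate continuation could be scored by one combined objective: low certainty from the future-predictor (surprise) plus high certainty from the past-analyzer (coherence). This exact formula is an assumption, and the certainty values below are placeholders rather than real model outputs:

```python
def aesthetics_score(future_certainty, past_certainty):
    """Assumed combined reward: a continuation scores higher when the
    future-predictor is uncertain (the music surprises) while the
    past-analyzer is certain (the music stays coherent)."""
    return past_certainty - future_certainty

# Placeholder certainties in [0, 1] for three hypothetical continuations.
surprising_coherent = aesthetics_score(future_certainty=0.2, past_certainty=0.9)
predictable = aesthetics_score(future_certainty=0.9, past_certainty=0.9)
incoherent = aesthetics_score(future_certainty=0.2, past_certainty=0.1)
```

Under this scoring, a continuation that is both surprising and coherent beats one that is merely predictable or merely chaotic, which matches the tension the steps above aim for.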

submitted by /u/ad48hp