[P] Probabilistic Cityscapes scene generator

Hey all, I’ve been working on a custom generative model for a while, and I just trained it on the Cityscapes dataset. I was pleased with what it learned to produce in just 5 hours of training on a single GPU (1080 Ti). I also think the generation process itself looks pretty neat, so I made a video of that too.

Here are 25 non-cherry-picked results:

https://i.redd.it/1hr1c1cc1us21.png

And here’s a video showing the generation process (different run, different result):

https://reddit.com/link/be8fe1/video/qgbpckog1us21/player

As you might be able to tell, it’s an autoregressive model. However, it differs from PixelCNN and its variants in that it doesn’t sample from top left to bottom right; instead, it samples at random positions. The benefit is that as you get more samples, the dependencies between pixels get more and more local, so you can get away with sampling more than a single subpixel per inference step as long as they are sufficiently far apart. In this example, it takes 145 steps to sample 24,576 subpixels (64x128x3), which is only about 0.6% of the steps you would need with a PixelCNN. I know I’m not the first one with this idea, but I’m surprised by how well it seems to work. There are some more details I’m going to keep to myself for now, but I’m curious to hear what you think of the result so far.
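To make the "sufficiently far apart" idea concrete, here is a minimal sketch of what a random-order, parallel sampling schedule could look like. It is not the model from this post, whose details are not disclosed: the function names, the Chebyshev distance check, the halving of the minimum distance, and the floor of 2 are all assumptions for illustration. It also works on pixel positions rather than individual subpixels, and it only builds the schedule; a real model would predict and sample values for each batch, conditioned on everything sampled so far.

```python
import random


def pick_batch(unsampled, min_dist, rng):
    """Greedily pick random unsampled (row, col) positions that are
    pairwise at least min_dist apart (Chebyshev distance)."""
    candidates = sorted(unsampled)  # sort first so a given seed is reproducible
    rng.shuffle(candidates)
    batch = []
    for r, c in candidates:
        if all(max(abs(r - br), abs(c - bc)) >= min_dist for br, bc in batch):
            batch.append((r, c))
    return batch


def sampling_schedule(height, width, start_dist, seed=0):
    """Return a list of batches; each batch is a list of positions that
    would be sampled together in one inference step."""
    rng = random.Random(seed)
    unsampled = {(r, c) for r in range(height) for c in range(width)}
    min_dist = start_dist
    steps = []
    while unsampled:
        batch = pick_batch(unsampled, min_dist, rng)
        # A real model would sample values for `batch` here, conditioned
        # on all positions filled in by earlier steps.
        steps.append(batch)
        unsampled.difference_update(batch)
        # Assumed schedule: halve the required distance each step, but never
        # let directly adjacent positions be sampled in the same step.
        min_dist = max(2, min_dist // 2)
    return steps


if __name__ == "__main__":
    schedule = sampling_schedule(height=16, width=32, start_dist=16)
    total = sum(len(batch) for batch in schedule)
    print(f"{len(schedule)} steps for {total} positions")
```

Early batches are small because little context exists yet and positions must be far apart; later batches grow as the required distance shrinks. That is where the saving over one-position-per-step sampling would come from in a scheme like this.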

I think I should be able to scale it up to at least double this resolution on my single GPU, but first I want to try it on some other datasets. In fact, the first thing I’m going to try it on is some raw audio data, to see if the same principle of parallel sampling works in that domain too.

submitted by /u/zarcomup