[D] GANs were invented in 2010?
A method for training artificial neural networks to generate missing data within a variable context. As the idea is hard to put in a single sentence, I will use an example:
An image may have missing pixels (let’s say, under a smudge). How can one restore the missing pixels, knowing only the surrounding pixels? One approach would be a “generator” neural network that, given the surrounding pixels as input, generates the missing pixels.
But how to train such a network? One can’t expect the network to exactly produce the missing pixels. Imagine, for example, that the missing data is a patch of grass. One could teach the network with a bunch of images of lawns, with portions removed. The teacher knows the data that is missing, and could score the network according to the root mean square difference (RMSD) between the generated patch of grass and the original data. The problem is that if the generator encounters an image that is not part of the training set, it would be impossible for the neural network to put all the leaves, especially in the middle of the patch, in exactly the right places. The lowest RMSD error would probably be achieved by the network filling the middle area of the patch with a solid color that is the average of the color of pixels in typical images of grass. If the network tried to generate grass that looks convincing to a human and as such fulfills its purpose, there would be an unfortunate penalty by the RMSD metric.
My idea is this (see figure below): Train simultaneously with the generator a classifier network that is given, in random or alternating sequence, generated and original data. The classifier then has to guess, in the context of the surrounding image context, whether the input is original (1) or generated (0). The generator network is simultaneously trying to get a high score (1) from the classifier. The outcome, hopefully, is that both networks start out really simple, and progress towards generating and recognizing more and more advanced features, approaching and possibly defeating human’s ability to discern between the generated data and the original. If multiple training samples are considered for each score, then RMSD is the correct error metric to use, as this will encourage the classifier network to output probabilities.