[D] Neural network: applying activity regularization in layer space
I’m trying to build an autoencoder with an activity regularization that is a function of a layer’s output for a single observation. As I understand it, activity regularization is usually computed on the output of each cell across a batch, to promote balanced and sparse activation of each cell. Is that correct?
In this particular case, I want to promote sparse activation of the layer in such a way that (1) the cells’ activations influence each other and (2) the mean activation per cell is not necessarily balanced. I’m aware that with the L1 norm the axis over which the sum is taken doesn’t matter, so I plan to use a tweaked L(1/2) norm.
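To make the axis point concrete, here is a rough NumPy sketch (not Keras code). It uses the plain L(1/2) quasi-norm, (sum_j sqrt(|a_j|))^2, as a stand-in for the tweaked version I have in mind; the function names and the toy activations are made up for illustration.

```python
import numpy as np

def per_sample_l_half(acts):
    # acts has shape (batch, cells). One penalty per observation:
    # (sum over cells of sqrt(|a|))^2, so the cells of a row are coupled
    # through the cross terms of the square.
    return np.square(np.sqrt(np.abs(acts)).sum(axis=1))

def per_cell_l_half(acts):
    # Same formula, but summed over the batch axis: one penalty per cell.
    return np.square(np.sqrt(np.abs(acts)).sum(axis=0))

# Two observations with the same L1 norm (1.0): one peaked, one spread out.
acts = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.25, 0.25, 0.25, 0.25]])

# L1: the total is the same whichever axis is summed first.
l1_by_cell = np.abs(acts).sum(axis=0).sum()
l1_by_sample = np.abs(acts).sum(axis=1).sum()

# L(1/2): the axis matters, and the spread-out row is penalized more.
sample_penalty = per_sample_l_half(acts)
cell_penalty = per_cell_l_half(acts)
```

On this toy input the per-sample penalty is larger for the spread-out row than for the peaked one, even though L1 cannot tell them apart. A trainable version would presumably need a small epsilon inside the square root, since its gradient blows up at 0. Whether a function like this can simply be passed as a layer’s `activity_regularizer`, and how the result gets aggregated over the batch, is part of what I’m asking.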
Does this make sense, and is there a simple way to do this in Keras?
PS: so far I’ve done something in that spirit by alternately training the encoder on its own modified output (e.g. setting the maximum activation to 1 and the rest to 0, which is pretty brutal) and the autoencoder on the training data. It kind of works, but it’s slow and could be a lot better.
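For reference, the "modified output" in that alternating scheme is built roughly like this (NumPy sketch; `winner_take_all` is just an illustrative name, and ties go to the first maximum because that is what `argmax` returns):

```python
import numpy as np

def winner_take_all(codes):
    # codes has shape (batch, cells). Returns an array of the same shape with
    # 1.0 at each row's largest activation and 0.0 everywhere else.
    targets = np.zeros_like(codes)
    targets[np.arange(codes.shape[0]), codes.argmax(axis=1)] = 1.0
    return targets

# Toy encoder outputs for two observations.
codes = np.array([[0.2, 0.7, 0.1],
                  [0.9, 0.05, 0.05]])
targets = winner_take_all(codes)
```

The encoder is then trained to reproduce `targets` from the same inputs, before switching back to training the full autoencoder on the data.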