[R] Neural Networks with non-smooth loss?
I’m a student researcher looking for literature on neural network parameter optimization where the objective loss is non-smooth. By that I mean the typical gradient-based methods are, strictly speaking, ruled out and something like proximal gradient methods is used instead. Preferably in the context of regression. The non-smoothness seems to be commonly ignored in practice.
- Are non-differentiable losses avoided in NNs?
- Is there a need for this kind of work from a non-theoretical point of view? Or is it that the smoothness conditions are violated, but gradient methods still succeed empirically?
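For concreteness, here is a minimal sketch of what I mean by a proximal gradient method, using NumPy on a linear regression with an L1 penalty (the lasso) rather than a neural network. The function names, the 1/L step size, and the choice of the L1 term as the non-smooth part are my own assumptions for illustration, not taken from any particular paper.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def proximal_gradient_lasso(X, y, lam, n_iters=500):
    """Minimize (1 / 2n) * ||X w - y||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    n, d = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant
    # of the gradient of the smooth squared-error term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iters):
        # Ordinary gradient step on the smooth term only.
        grad = X.T @ (X @ w - y) / n
        # The non-smooth L1 term is handled by its proximal operator.
        w = soft_threshold(w - step * grad, step * lam)
    return w
```

The smooth squared-error term gets an ordinary gradient step, while the non-smooth term is handled exactly by its proximal operator instead of through a subgradient.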
I have many more questions, but really any direction or content would be helpful! Thanks!