[D] Measuring statistical modelling capabilities of neural networks?
Does anybody know a method for measuring how well a neural network can (theoretically) model a distribution? I'm especially interested in neural machine translation, where there are many ways to model the same distribution but, as far as I could find, no theoretical framework for deciding which approach is actually more capable. For example, one could model p(y|x) with x being the raw data, or with x being the representation obtained after applying many nonlinearities to the raw data. Statistically the two are the same (or at least many papers claim they are), but in practice the resulting models differ greatly in how well they capture the distribution.
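To make the "statistically the same" claim concrete, here is a minimal sketch on a toy discrete problem. Everything in it is an assumption for illustration: the joint table `JOINT` uses made-up numbers, and the helper `conditional_entropy` is a hypothetical name, not taken from any paper or library. It computes H(Y | f(X)) for an identity map, a bijective re-encoding of x, and a lossy one. A bijection leaves the conditional entropy unchanged, while a lossy map can only keep it the same or raise it.

```python
import math
from collections import defaultdict

# Toy joint distribution p(x, y) with x in {0,1,2,3} and y in {0,1}.
# The numbers are made up for illustration and sum to 1.
JOINT = {
    (0, 0): 0.20, (0, 1): 0.05,
    (1, 0): 0.05, (1, 1): 0.20,
    (2, 0): 0.15, (2, 1): 0.10,
    (3, 0): 0.10, (3, 1): 0.15,
}

def conditional_entropy(joint, f):
    """Return H(Y | f(X)) in bits for a joint table {(x, y): p}."""
    p_hy = defaultdict(float)  # p(h, y) where h = f(x)
    p_h = defaultdict(float)   # p(h)
    for (x, y), p in joint.items():
        h = f(x)
        p_hy[(h, y)] += p
        p_h[h] += p
    # H(Y|H) = -sum over (h, y) of p(h, y) * log2 p(y | h)
    return -sum(p * math.log2(p / p_h[h])
                for (h, _y), p in p_hy.items() if p > 0)

# Raw input: condition directly on x.
h_raw = conditional_entropy(JOINT, lambda x: x)

# Bijective re-encoding of x: no information about y is lost.
h_bijective = conditional_entropy(JOINT, lambda x: (3 * x + 1) % 4)

# Lossy representation: merges x=0 with x=1 and x=2 with x=3.
h_lossy = conditional_entropy(JOINT, lambda x: x // 2)

print(f"H(Y|X)           = {h_raw:.4f} bits")
print(f"H(Y|bijective X) = {h_bijective:.4f} bits")
print(f"H(Y|lossy X)     = {h_lossy:.4f} bits")
```

This only covers the information-theoretic side: it says what the best possible model could achieve given each representation, not how easily a particular network architecture or optimiser gets there, which is the gap the question is about.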
submitted by /u/ggNikita