[D] Many papers don’t do hyperparameter search on DNN baselines

Something I noticed after reading various DNN papers is that the authors often don't seem to perform any hyperparameter search on their baseline models. Many reported results appear to come from hand-picked configurations only, with no search method (grid search, Bayesian optimization, or even random search) used to find the best-performing configuration.
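To make concrete what even the cheapest of these methods involves, here is a minimal sketch of random search. Everything in it is an assumption for illustration: `validation_score` is a hypothetical stand-in for "train the model and measure validation accuracy", and the search ranges for `lr` and `dropout` are arbitrary.

```python
import math
import random


def validation_score(lr, dropout):
    # Hypothetical toy objective, NOT a real model: it stands in for a full
    # training run and peaks at lr = 1e-3, dropout = 0.2.
    return 1.0 - 0.1 * (math.log10(lr) + 3.0) ** 2 - (dropout - 0.2) ** 2


def random_search(n_trials, seed=0):
    # Sample configurations independently and keep the best one seen.
    rng = random.Random(seed)
    best_score, best_config = -math.inf, None
    for _ in range(n_trials):
        config = {
            "lr": 10 ** rng.uniform(-5, -1),   # log-uniform over [1e-5, 1e-1]
            "dropout": rng.uniform(0.0, 0.5),  # uniform over [0, 0.5]
        }
        score = validation_score(**config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config


# A single hand-picked configuration versus 50 random trials.
hand_picked = validation_score(lr=1e-1, dropout=0.5)
searched, best_config = random_search(n_trials=50)
print(f"hand-picked: {hand_picked:.3f}  random search: {searched:.3f}")
```

In a real setting each call to `validation_score` is a full training run, so the number of trials is the actual cost of the search.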

IMO this is a problem: the performance of a DNN model depends heavily on the choice of hyperparameters, so hypothetically you could make a baseline model perform badly just by picking poor hyperparameters.
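A toy sketch of that effect, with made-up numbers: both accuracy functions below are hypothetical curves, not measurements from any paper. The point is only that the gap between a tuned new model and an untuned baseline can be much larger than the gap once both get the same search.

```python
def baseline_acc(lr_exp):
    # Hypothetical baseline: best accuracy 0.90 at lr = 10**-3.
    return 0.90 - 0.05 * (lr_exp + 3) ** 2


def proposed_acc(lr_exp):
    # Hypothetical new model: best accuracy 0.91 at lr = 10**-2.
    return 0.91 - 0.05 * (lr_exp + 2) ** 2


# Same small grid of learning-rate exponents for both models.
grid = [-5, -4, -3, -2, -1]

proposed_best = max(proposed_acc(e) for e in grid)
baseline_untuned = baseline_acc(-1)                  # one hand-picked, poor setting
baseline_tuned = max(baseline_acc(e) for e in grid)  # same budget as the new model

apparent_gain = proposed_best - baseline_untuned
fair_gain = proposed_best - baseline_tuned
print(f"gain vs untuned baseline: {apparent_gain:.2f}")
print(f"gain vs tuned baseline:   {fair_gain:.2f}")
```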

Why are there so many high-profile papers with such incomplete evaluations? Or am I missing something here, and is it enough to look at a single configuration?

submitted by /u/alex19111
