[D] Overfitting vs. Generalization – a subtle difference

In my view, overfitting does not necessarily imply a lack of generalization, just as generalization cannot be directly equated with the degree of overfitting.

An overfit model is a model that is tuned to achieve the highest performance (e.g., lowest loss) on the dataset it was trained on. This can be tested by the difference between the losses on the validation set and the training set. To test for overfitting, the training and validation sets should have similar distributions. If that's the case, an overfit model's performance on the validation set will deviate from its training performance. This is because, even when the distributions are similar, the model is tuned to fit only the specific samples it has seen in the training set.
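To make the overfitting test concrete, here's a minimal numpy sketch (my own toy example, not from any particular framework): train and validation data come from the same distribution, and an overly flexible model shows a large train/validation loss gap anyway.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = sin(x) + noise. Train and validation sets are
# drawn from the SAME distribution (x uniform on [0, 3]).
def sample(n):
    x = rng.uniform(0, 3, n)
    y = np.sin(x) + rng.normal(0, 0.1, n)
    return x, y

x_train, y_train = sample(20)
x_val, y_val = sample(20)

# An overly flexible model: degree-15 polynomial fit to 20 points.
coefs = np.polyfit(x_train, y_train, deg=15)

def mse(x, y):
    return np.mean((np.polyval(coefs, x) - y) ** 2)

train_loss = mse(x_train, y_train)
val_loss = mse(x_val, y_val)

# Overfitting shows up as a gap even though the distributions
# match: the model memorized the training samples, so its
# validation loss deviates upward from its training loss.
print(f"train MSE: {train_loss:.4f}  val MSE: {val_loss:.4f}")
```

Since both sets share a distribution, the gap here is attributable to overfitting rather than to distribution shift, which is exactly the separation being argued for.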

As for generalization, it can only be evaluated between datasets (test and training) that have different distributions. Ideally, the test distribution will be the most heterogeneous of them all. In my opinion, this is the only way to really assess generalization: the difference between the losses on the training versus the test set.
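And the complementary case, again as a toy sketch of my own: a model that is not overfit (small train/validation gap on its own distribution) can still fail to generalize when the test data comes from a different distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fit a modest model (degree-3 polynomial) to sin(x) on a
# narrow training range [0, 3].
x_train = rng.uniform(0, 3, 200)
y_train = np.sin(x_train) + rng.normal(0, 0.1, 200)

# Test set drawn from a DIFFERENT, wider distribution [0, 6].
x_test = rng.uniform(0, 6, 200)
y_test = np.sin(x_test) + rng.normal(0, 0.1, 200)

coefs = np.polyfit(x_train, y_train, deg=3)

def mse(x, y):
    return np.mean((np.polyval(coefs, x) - y) ** 2)

train_loss = mse(x_train, y_train)
shifted_loss = mse(x_test, y_test)

# The model does fine in-distribution, but the loss blows up on
# the shifted test set: a generalization failure that a same-
# distribution train/validation split would never reveal.
print(f"train MSE: {train_loss:.4f}  shifted-test MSE: {shifted_loss:.4f}")
```

This is why I'd measure generalization against the most heterogeneous test distribution available: an in-distribution validation split simply cannot detect this failure mode.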

TLDR: Overfitting is indicated when a model underperforms on unseen data with a similar distribution to the seen data. Generalization, on the other hand, is indicated by the performance difference between seen and unseen data with different distributions, where the unseen data ideally represents real-world distributions.

I think most people conflate the two, even in industry.

What are your thoughts?

submitted by /u/eigenlaplace