[D] Machine Learning vs Statistics
I know this is an old debate, but I was talking to a colleague at work and something he said struck me as really odd: “Statistics is more concerned with inference than results.” Afterwards I did a bit of research online and found the same narrative there too: that statistics doesn’t use a train-test split, isn’t concerned with performance on unseen data, and so on.
But this led me to think that statistics, if it really isn’t concerned with performance on unseen data, is doing something wrong.
If you fit your training set perfectly with an interpretable model, but its performance on unseen data is dismal, should we really take the interpretations from that model as the truth?
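To make that scenario concrete, here is a minimal sketch of the kind of situation I have in mind. Everything in it is an assumption for illustration: the data is synthetic, only the first feature actually drives the response, and the names (`make_data`, `r_squared`, the sample sizes) are made up. It fits ordinary least squares with as many parameters as training points, so the fit on the training set is essentially exact, and then scores the same coefficients on fresh data.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_train, n_test, n_features = 20, 200, 19  # hypothetical sizes

def make_data(n):
    """Synthetic data: only feature 0 matters, the other 18 are pure noise."""
    X = rng.normal(size=(n, n_features))
    y = 2.0 * X[:, 0] + rng.normal(size=n)
    # Prepend an intercept column, giving 20 parameters in total.
    return np.column_stack([np.ones(n), X]), y

A_train, y_train = make_data(n_train)
A_test, y_test = make_data(n_test)

# Ordinary least squares: 20 parameters for 20 training points,
# so the model can interpolate the training data.
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

train_r2 = r_squared(y_train, A_train @ coef)
test_r2 = r_squared(y_test, A_test @ coef)

print(f"train R^2: {train_r2:.3f}")
print(f"test  R^2: {test_r2:.3f}")
# The true coefficients on the 18 noise features are all zero,
# so anything large here is the model "interpreting" noise.
print(f"largest |coef| on a noise feature: {np.abs(coef[2:]).max():.2f}")
```

The model is a plain linear regression, so every coefficient is readable, yet the coefficients on the noise features are fitted to the training noise rather than to anything real. That is the sense in which I am asking whether the interpretation can be trusted.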
I’m looking to all the statisticians out there to tell me I’m wrong, and why.