[D] Validating regression models on edge cases?
I’m trying to predict used car prices from a set of input features.
R² is above 0.98 on the test data, but on new data the predictions for edge cases are off by (what I think of as) too much.
Beyond the aggregate evaluation metric, how can we validate that a result is good enough, even for an edge case?
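One way to see what a single R² hides is to break the error down by slices of a feature. This is only a rough sketch: the function name `error_by_bucket`, the choice of mean absolute percentage error, and the bucket edges are all assumptions for illustration, not something from my actual pipeline.

```python
import numpy as np

def error_by_bucket(y_true, y_pred, feature, edges):
    """Mean absolute percentage error per bucket of one feature.

    `edges` are hypothetical cut points (e.g. kilometers or age);
    bucket 0 is everything below the first edge.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    feature = np.asarray(feature, dtype=float)
    # np.digitize returns, for each value, the index of the bin it falls in
    bucket_ids = np.digitize(feature, edges)
    report = {}
    for b in np.unique(bucket_ids):
        mask = bucket_ids == b
        # absolute percentage error; assumes prices are strictly positive
        ape = np.abs(y_pred[mask] - y_true[mask]) / y_true[mask]
        report[int(b)] = {"n": int(mask.sum()), "mape": float(ape.mean())}
    return report
```

If the buckets that hold the edge cases (very old or very high-mileage cars, say) show a much larger error or contain only a handful of samples, that would explain a high overall R² alongside poor edge-case predictions.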
Currently, I’m thinking about fitting a simple linear regression on age and kilometers over a wide range of values, with price as the target. This would give me a second model that I could check my edge-case predictions against and use to pull them toward a more average case.
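Roughly what I have in mind, as a minimal sketch. It assumes plain ordinary least squares on just age and kilometers; the helper names (`fit_baseline`, `baseline_predict`, `flag_disagreements`) and the 30% tolerance are made up for illustration.

```python
import numpy as np

def fit_baseline(age, km, price):
    """Ordinary least squares fit of price ~ b0 + b1*age + b2*km."""
    X = np.column_stack([np.ones(len(age)), age, km])
    coef, *_ = np.linalg.lstsq(X, np.asarray(price, dtype=float), rcond=None)
    return coef

def baseline_predict(coef, age, km):
    """Predict price from the fitted baseline coefficients."""
    X = np.column_stack([np.ones(len(age)), age, km])
    return X @ coef

def flag_disagreements(model_pred, base_pred, rel_tol=0.3):
    """True where the main model differs from the baseline by more
    than rel_tol (an arbitrary, assumed threshold) in relative terms."""
    model_pred = np.asarray(model_pred, dtype=float)
    base_pred = np.asarray(base_pred, dtype=float)
    return np.abs(model_pred - base_pred) > rel_tol * np.abs(base_pred)
```

A flagged sample would not prove the main model is wrong, since the linear baseline can itself be poor at the extremes. It would only mark predictions worth a closer look.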
I’m really just seeking advice on what to do here. Is this approach good enough? What other approaches are there for validating or sanity-checking whether each individual prediction is good enough?
submitted by /u/permalip