[D] Can adding more data make the model perform worse?

Hi, I am using an XGBoost regressor for a personal project. Initially I used a data set with measurements from 01.Jan.2016 to 24.Dec.2018 and got these results on the test data: MAE = 2.332, MSE = 7.764. I recently got the same data set from the same source, but with measurements from 01.Jan.2016 up to 14.May.2019, and on the test data I got: MAE = 2.729, MSE = 12.002. In both cases I tuned the hyperparameters using the same method, through cross-validation. I tried adjusting the parameters further for the second data set but did not get better results. Even if the differences are not very large, could using the larger data set have hurt the performance, or is there something I have overlooked?
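
To put the two runs side by side, here is a minimal sketch that uses only the four numbers reported above. The names `runs`, `relative_change` and `rmse_to_mae_ratio` are made up for illustration and are not part of XGBoost. It computes how much each metric moved and the RMSE/MAE ratio, which is always at least 1 and grows when the errors become more uneven (a few large misses next to many small ones).

```python
import math

# Test-set metrics exactly as reported in the post.
runs = {
    "2016-01-01 to 2018-12-24": {"mae": 2.332, "mse": 7.764},
    "2016-01-01 to 2019-05-14": {"mae": 2.729, "mse": 12.002},
}


def relative_change(old, new):
    """Fractional change from old to new (0.10 means +10%)."""
    return (new - old) / old


def rmse_to_mae_ratio(mae, mse):
    """RMSE divided by MAE; >= 1, larger when errors are more uneven."""
    return math.sqrt(mse) / mae


first, second = runs.values()

mae_change = relative_change(first["mae"], second["mae"])
mse_change = relative_change(first["mse"], second["mse"])

print(f"MAE change: {mae_change:+.1%}")
print(f"MSE change: {mse_change:+.1%}")

for period, metrics in runs.items():
    ratio = rmse_to_mae_ratio(metrics["mae"], metrics["mse"])
    print(f"{period}: RMSE/MAE = {ratio:.3f}")
```

MSE grows much faster than MAE here, and the RMSE/MAE ratio goes up, which would be consistent with a handful of large errors in the second run rather than a uniform loss of accuracy. That is only a reading of the summary numbers and cannot be confirmed without the per-sample residuals. It also assumes the two scores are comparable: if the test split was redrawn from the longer data set, the two models were scored on different samples, and evaluating both on one common held-out period would be a fairer check.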

submitted by /u/Bigdey