[R] Increase model performance by removing certain subsets of data.

In industry and research workflows today, we greedily acquire and label as much data as possible, then train on all of it. While more data usually corresponds to better model performance, this is not always the case. New research in data valuation allows us to target the subsets of our data that would train the best model.

In this article, we explore cases where less data is better and show how to identify which data is irrelevant to the machine learning task at hand.
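To make the idea concrete, here is a minimal sketch of one simple data valuation scheme, leave-one-out valuation, which may differ from the method used in the article. Each training point is scored by how much validation accuracy drops when it is removed, so a negative score means the model does better without that point. The nearest-centroid classifier, the toy 1-D dataset, and all function names are assumptions chosen for illustration.

```python
from collections import defaultdict


def fit_centroids(points):
    """Return {label: mean of x} for a list of (x, label) pairs."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, label in points:
        sums[label] += x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def accuracy(train, val):
    """Fit a nearest-centroid classifier on train and score it on val."""
    centroids = fit_centroids(train)
    correct = 0
    for x, label in val:
        pred = min(centroids, key=lambda c: abs(x - centroids[c]))
        correct += pred == label
    return correct / len(val)


def leave_one_out_values(train, val):
    """Value of point i = accuracy with all data - accuracy without point i."""
    base = accuracy(train, val)
    return [
        base - accuracy(train[:i] + train[i + 1:], val)
        for i in range(len(train))
    ]


# Toy data (hypothetical): class 0 clusters near 1, class 1 near 9.
# The last training point sits in the class 1 cluster but is labeled 0.
train = [(0.0, 0), (1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1), (10.0, 1), (10.0, 0)]
val = [(0.5, 0), (4.0, 0), (5.5, 1), (5.8, 1), (9.5, 1)]

values = leave_one_out_values(train, val)

# Points with negative value hurt validation accuracy; drop them and retrain.
harmful = [i for i, v in enumerate(values) if v < 0]
pruned = [p for i, p in enumerate(train) if i not in harmful]

acc_before = accuracy(train, val)
acc_after = accuracy(pruned, val)
print("harmful indices:", harmful)
print("accuracy before:", acc_before, "after:", acc_after)
```

In this toy setup, only the mislabeled point receives a negative value, and removing it improves validation accuracy. Leave-one-out needs one retraining per data point, so it is only practical as an illustration on small datasets.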

Would love feedback on the article!

submitted by /u/princealiiiii