
[D] Let’s say someone gives you a big, challenging, labeled dataset to train a model on. How do you tell that the labels aren’t mostly random, and that putting energy into training a model isn’t a waste of time?

The data could be any kind, but it requires an expert to annotate correctly, and since you’re not an expert in that particular area, you can’t eyeball whether the labels make sense. Some baseline attempts can overfit the training data but fail hard on every validation split. At this point, how do you tell whether the problem is just really challenging or whether the labels are bad/wrong/random?
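One concrete version of this check is a label-permutation test: refit the same baseline on deliberately shuffled labels and compare cross-validated scores. If the real labels score no better than shuffled ones, the model is finding no learnable signal in them, which points at bad/random labels rather than just a hard problem. Below is a minimal sketch using scikit-learn’s permutation_test_score; the X, y, and RandomForestClassifier baseline are placeholder assumptions, not anything specific to the dataset in question.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, permutation_test_score

# Placeholder data: substitute your real feature matrix and labels here.
X = np.random.rand(500, 20)
y = np.random.randint(0, 2, size=500)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# permutation_test_score refits the model on many label shufflings and
# reports a p-value for "real labels score no better than shuffled ones".
score, perm_scores, p_value = permutation_test_score(
    clf, X, y, cv=cv, n_permutations=100, scoring="accuracy", n_jobs=-1
)

print(f"CV accuracy on real labels:       {score:.3f}")
print(f"Mean accuracy on shuffled labels: {perm_scores.mean():.3f}")
print(f"p-value (labels uninformative):   {p_value:.3f}")
```

One limitation: a low p-value only rules out pure noise. It can’t separate “hard but learnable” from “partially mislabeled”, so a common follow-up is a per-sample check, e.g. flagging examples whose cross-validated predicted probabilities strongly disagree with the given label.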

submitted by /u/DeepDeeperRIPgradien