[D] Term for keeping test and training data separate

So, I’ve been using the term “data hygiene” for the measures we take in (safety) ML to keep test and training data separate. Stuff like:

  • test data on an access-controlled network share
  • acquiring the test set later and by different teams
  • Thresholdout (when I finally get the chance to play around with it)
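
For readers who have not met Thresholdout: it comes from the "reusable holdout" work by Dwork et al., and the rough idea is to answer a query from the training set unless training and holdout disagree by more than a noisy threshold, in which case a noised holdout value is returned. Below is a minimal sketch of that idea, not the full published algorithm. The function name, the default `threshold` and `sigma` values, and the use of a single noise scale are illustrative assumptions, and the overfitting budget from the paper is left out.

```python
import numpy as np

def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query in the spirit of Thresholdout (simplified sketch).

    train_vals / holdout_vals hold the per-example values of the query
    (e.g. 0/1 correctness of a classifier) on each set.
    threshold and sigma are illustrative defaults, not recommendations.
    """
    rng = np.random.default_rng() if rng is None else rng
    train_mean = float(np.mean(train_vals))
    holdout_mean = float(np.mean(holdout_vals))

    # Compare the train/holdout gap against a noisy threshold.
    if abs(train_mean - holdout_mean) > threshold + rng.laplace(0.0, sigma):
        # The sets disagree: release the holdout estimate, but only with noise.
        return holdout_mean + rng.laplace(0.0, sigma)

    # The sets agree: the training estimate is returned and the holdout
    # value itself is never exposed.
    return train_mean
```

The point for keeping test and training data separate is that the analyst only ever calls something like `thresholdout(...)` and never reads the holdout set directly, so each query leaks only a limited amount of information about it.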

But apparently I just picked that term up from some fringe paper once, and data hygiene is actually a separate concept in data science?!

Does anyone have a good term for these methods/approaches?

submitted by /u/ipsLED87