[D] How would one detect data leakage in someone else’s model?

Pure hypothetical. Let’s say I have someone’s model (i.e. their final model weights) and also their train and test sets. I don’t have any additional validation data readily available.

What kind of heuristics can be used to evaluate if there was data leakage from the test set?

I’d like to distinguish two cases: 1) there is data leakage, 2) the model is really good. Based on just performance metrics on the train/test sets, I don’t feel like I can distinguish these two cases. Would it be impossible to tell without additional validation data?
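For concreteness, here is one heuristic of the kind being asked about, as a rough sketch and not a definitive test: look for test rows that also appear verbatim in the train set, then compare the model's accuracy on those "seen" test rows against the "unseen" ones. Everything named here (`row_fingerprint`, `split_test_by_overlap`, `accuracy`, the toy `predict`) is hypothetical and made up for illustration. It assumes rows are simple tuples of features and that the model can be wrapped as a `predict(row)` function. It only catches exact duplicates, so a clean result would not rule out leakage.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's feature values into a stable fingerprint (exact match only)."""
    joined = "\x1f".join(repr(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def split_test_by_overlap(train_rows, test_rows):
    """Return (seen, unseen): indices of test rows that do / do not occur in train."""
    train_hashes = {row_fingerprint(r) for r in train_rows}
    seen, unseen = [], []
    for i, row in enumerate(test_rows):
        if row_fingerprint(row) in train_hashes:
            seen.append(i)
        else:
            unseen.append(i)
    return seen, unseen


def accuracy(predict, rows, labels, indices):
    """Accuracy of predict() on the given subset; None if the subset is empty."""
    if not indices:
        return None
    correct = sum(predict(rows[i]) == labels[i] for i in indices)
    return correct / len(indices)


if __name__ == "__main__":
    # Toy data, purely illustrative.
    train_rows = [(1, 2), (3, 4)]
    test_rows = [(3, 4), (5, 6), (1, 2)]
    test_labels = [True, False, False]

    def predict(row):
        # Stand-in for the real model (an assumption for this sketch).
        return row[0] > 2

    seen, unseen = split_test_by_overlap(train_rows, test_rows)
    print("overlap fraction:", len(seen) / len(test_rows))
    print("accuracy on seen test rows:", accuracy(predict, test_rows, test_labels, seen))
    print("accuracy on unseen test rows:", accuracy(predict, test_rows, test_labels, unseen))
```

A large overlap fraction, or a big accuracy gap between the seen and unseen subsets, would be a hint worth investigating. It would not be proof either way.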

submitted by /u/CrazyAsparagus