[D] AI Scandal: SOTA classifier with 92% ImageNet accuracy scores 2% on new dataset
On a new dataset of unedited images, with no adversarial noise injected, ResNeXt-50 and DenseNet-121 see their accuracies drop to under 3%. Other former SOTA approaches plummet by similarly unacceptable margins:
– Natural Adversarial Examples – original paper, July 2019
– These Images Fool Neural Networks – TwoMinutePapers clip, 5 mins
So who says it’s a scandal? Well, I do – and I’ve yet to hear an uproar over it. A simple yet disturbing interpretation of these results: there are millions of images out there that we humans can identify easily and obviously, yet our best AI completely flunks.
Thoughts on this? I summarize some of mine below, along with a few of the authors’ findings.
Where’d they get the images? The idea’s pretty simple: take a subset of images that several top classifiers get wrong, then gather more images like them (rough sketch below).
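A minimal sketch of that kind of filtering step – assuming a torchvision-style pretrained ResNet-50 and a local `candidates/` folder of labeled images (the paper’s actual pipeline also involves manual review and class filtering, so this is just the automatic part):

```python
import torch
from torchvision import models, transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

# Assumption: candidate images live in ./candidates/<class_name>/*.jpg,
# with folder names chosen so their indices line up with ImageNet classes.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

dataset = ImageFolder("candidates", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=False)

model = models.resnet50(pretrained=True).eval()

hard_examples = []  # paths of images the pretrained model misclassifies
with torch.no_grad():
    for batch_idx, (images, labels) in enumerate(loader):
        preds = model(images).argmax(dim=1)
        wrong = (preds != labels).nonzero(as_tuple=True)[0]
        for i in wrong:
            hard_examples.append(dataset.samples[batch_idx * 32 + i.item()][0])

print(f"kept {len(hard_examples)} natural 'hard' candidates")
```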
Why do the NNs fail? Misclassified images tend to share a set of features that can be systematically exploited -> adversarial attacks. Instead of artificially injecting such features, the authors find images that already contain them: “Networks may rely too heavily on texture and color cues, for instance misclassifying a dragonfly as a banana presumably due to a nearby yellow shovel” (pg. 4).
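For contrast, “artificially injecting such features” is what a classic gradient-based attack like FGSM does – a minimal sketch, not from this paper (which uses no synthetic perturbations at all):

```python
import torch
import torch.nn.functional as F

def fgsm(model, image, label, eps=0.03):
    """One-step FGSM: nudge each pixel in the direction that increases the loss.
    Assumes a batched image tensor with pixel values in [0, 1]."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # The perturbation is near-invisible to us, but systematically exploits
    # the texture/color-style cues the network leans on.
    return (image + eps * image.grad.sign()).clamp(0, 1).detach()
```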
Implications for research: self-attention mechanisms, e.g. Squeeze-and-Excite (minimal sketch after this list), improve accuracy on ImageNet by ~1% – but on this new dataset, by 10%. Likewise, other robustness-oriented methods may buy only a little on benchmark datasets while buying a lot on adversarial ones.
- Thus, instead of pouring all efforts into maximizing F1-score on dataset A, it may be more worthwhile to test against engineered robustness metrics that promise improvement on an unseen dataset B (e.g. “mean corruption error”, pg. 8 – sketched further below).
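To make the Squeeze-and-Excite point concrete, here’s a minimal SE block as usually described (Hu et al.) – the kind of cheap channel re-weighting that apparently helps far more on the hard dataset than on ImageNet proper:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: globally pool each channel ("squeeze"),
    learn per-channel gates ("excite"), and rescale the feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(x.mean(dim=(2, 3)))   # squeeze: (b, c)
        return x * weights.view(b, c, 1, 1)     # excite: rescale channels
```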
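And for “mean corruption error”, my understanding of the ImageNet-C formulation (Hendrycks & Dietterich): sum a model’s error rates over severity levels for each corruption, normalize by AlexNet’s error on the same corruption, then average over corruption types. A rough sketch with hypothetical inputs:

```python
def mean_corruption_error(model_err, alexnet_err):
    """model_err / alexnet_err: dict mapping corruption name -> list of
    top-1 error rates at each severity level (1-5).
    Returns mCE: AlexNet-normalized error, averaged over corruptions."""
    ces = []
    for corruption, errs in model_err.items():
        ces.append(sum(errs) / sum(alexnet_err[corruption]))
    return sum(ces) / len(ces)

# e.g. mean_corruption_error({"gaussian_noise": [0.30, 0.42, 0.55, 0.68, 0.80], ...},
#                            {"gaussian_noise": [0.55, 0.65, 0.78, 0.88, 0.94], ...})
```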
Implications for business: you don’t want your bear-catching drone tranquilizing a kid holding a teddy.