[D] Classification: Having the NN know when it doesn’t know
So I’m working on:
Building an app to classify animals for the visually impaired. Users take a picture with the app and get the name of the animal. If the camera is pointed somewhere with no animal, it should predict “No Animal”. But ALSO, if the camera points at an animal that I don’t have in my dataset, I’d like it to predict “Unrecognized Animal” so I can store the frame, manually tag it, and feed it back into my training set.
Here’s what I’m thinking:
On data: Include varied images with no animals in them, which the network should predict as “No Animal”. Also take a number of species and label them “Unrecognized Animal” so the network learns what it doesn’t know. The truth label for those would be [Animal = 1, Recognized = 0, 0, 0, 0, 0…], vs. e.g. [Animal = 1, Recognized = 1, 0, 0, 1, 0…] for a recognized animal. I know the normal approach would be to decide “Unrecognized” by thresholding the max predicted confidence, but several papers and empirical evidence show how overconfident nets can be, so I’m not too sure about that.
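To make the label layout concrete, here is a rough sketch of how I picture encoding the targets and turning the three outputs back into an answer. The species list, the "none"/"unknown" label strings, the function names and the 0.5 thresholds are all placeholders I made up for illustration, not anything final.

```python
# Hypothetical species list; the real app would use its own classes.
SPECIES = ["cat", "dog", "horse", "cow"]


def encode_target(label):
    """Return [animal, recognized, one-hot species...] for a label.

    label is "none" (no animal in the frame), "unknown" (an animal that
    is not in SPECIES), or one of the names in SPECIES.
    """
    one_hot = [0] * len(SPECIES)
    if label == "none":
        return [0, 0] + one_hot
    if label == "unknown":
        return [1, 0] + one_hot
    one_hot[SPECIES.index(label)] = 1
    return [1, 1] + one_hot


def decode_prediction(p_animal, p_recognized, species_probs, threshold=0.5):
    """Map the three outputs to the string shown to the user.

    The 0.5 threshold is an arbitrary placeholder and would need tuning.
    """
    if p_animal < threshold:
        return "No Animal"
    if p_recognized < threshold:
        return "Unrecognized Animal"
    best = max(range(len(SPECIES)), key=lambda i: species_probs[i])
    return SPECIES[best]
```

With this layout, `encode_target("horse")` gives the same shape as the recognized example above (animal bit, recognized bit, then the one-hot species part), and any frame decoded as “Unrecognized Animal” is one I would store for manual tagging.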
On loss function: I was just going to use cross entropy for each of the three terms (Animal/No Animal, Recognized/Not Recognized, animal classification) and combine them as a weighted sum.
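Roughly, the weighted loss I have in mind would look like the sketch below, written in plain Python for a single example so it is easy to read. One thing I am assuming here, and am not sure is right: the Recognized term only counts when there is an animal, and the species term only counts when the animal is recognized, so undefined parts of the label do not contribute. The function names and default weights are placeholders.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross entropy for one probability p against a 0/1 target y."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def categorical_cross_entropy(probs, one_hot, eps=1e-7):
    """Cross entropy for a probability vector against a one-hot target."""
    return -sum(y * math.log(max(p, eps)) for p, y in zip(probs, one_hot))


def combined_loss(p_animal, p_recognized, species_probs, target,
                  weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three cross-entropy terms for one example.

    target is [animal, recognized, one-hot species...].
    Assumption: terms that are undefined for this example are masked out.
    """
    y_animal, y_recognized, species_one_hot = target[0], target[1], target[2:]
    w_animal, w_recognized, w_species = weights

    loss = w_animal * binary_cross_entropy(p_animal, y_animal)
    if y_animal == 1:
        loss += w_recognized * binary_cross_entropy(p_recognized, y_recognized)
        if y_recognized == 1:
            loss += w_species * categorical_cross_entropy(
                species_probs, species_one_hot)
    return loss
```

For a “No Animal” frame only the first term is active, so the other two outputs are free to be anything. Whether that masking helps or hurts compared to training all three terms on every example is part of what I am asking.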
Is my approach in the right direction? I don’t know how to express this problem well enough to find good results on Google, but this must have been solved before, right?
Thanks for any ideas!