True accuracy calculation in positive-unlabeled setting [D]
In the positive-unlabeled (PU) setting, only the positive class is labeled (e.g. when we have a pattern whose matches are positive, but a failure to match does not mean the case is negative). Such a data set is much easier to create (domain experts can often provide rules for what qualifies as positive, but cannot formulate a rule that rules out the positive class), but accuracy metrics calculated on a positive-unlabeled data set are biased, because the unlabeled examples that get treated as negatives include some positives.
There are papers showing how to recover the true accuracy (i.e. the one that would be calculated on a data set containing both negative and positive examples), but surprisingly they seem to be ignored in more applied papers. Is there a reason? Such a method would solve a huge problem, especially in the medical domain, where the rarity of the positive class requires a huge sample size for validation.
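To make the kind of correction I mean concrete, here is a minimal Python sketch. It is my own simplified illustration, not the exact estimator from the papers below. It assumes the labeled positives are clean and selected completely at random, that the unlabeled set is a representative sample of the population, and that the class prior `alpha` (the fraction of positives in the unlabeled set) is already known. The function name `correct_pu_metrics` and its parameters are made up for the example.

```python
import math

def correct_pu_metrics(tpr_pu, fpr_pu, alpha):
    """Correct metrics measured with 'unlabeled' treated as 'negative'.

    tpr_pu: fraction of labeled positives the classifier flags.
    fpr_pu: fraction of unlabeled examples the classifier flags.
    alpha:  assumed fraction of true positives in the unlabeled set.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")

    # With clean, randomly selected labeled positives, the measured TPR
    # is already the true TPR.
    tpr = tpr_pu

    # The unlabeled set is a mixture, so
    #   fpr_pu = alpha * tpr + (1 - alpha) * fpr
    # which we solve for the true FPR, clipping against sampling noise.
    fpr = (fpr_pu - alpha * tpr) / (1.0 - alpha)
    fpr = min(max(fpr, 0.0), 1.0)

    # Metrics on the population, where positives have prior alpha.
    accuracy = alpha * tpr + (1.0 - alpha) * (1.0 - fpr)
    flagged = alpha * tpr + (1.0 - alpha) * fpr
    precision = alpha * tpr / flagged if flagged > 0 else math.nan

    return {"tpr": tpr, "fpr": fpr, "accuracy": accuracy, "precision": precision}

# Hypothetical numbers, for illustration only.
print(correct_pu_metrics(tpr_pu=0.8, fpr_pu=0.3, alpha=0.2))
```

Everything here hinges on `alpha`: a small error in the prior shifts the corrected false positive rate and precision directly, which is why the prior estimate matters so much.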
The major challenge in these works is estimating the positive class prior, and they propose various algorithms for that (e.g. AlphaMax). Is there any reason not to simply review a sample of the positive and unlabeled sets manually and count the number of positive cases?
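The manual alternative I have in mind would look roughly like the sketch below: draw a random sample from the unlabeled set, have an expert label it, and report the proportion of positives with a confidence interval. I use the standard Wilson score interval here. The function names and the example counts are hypothetical, and the sketch assumes the reviewed sample is drawn uniformly at random and that the reviewer's labels are correct.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a proportion (k successes out of n)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

def estimate_prior(reviewed_labels, z=1.96):
    """Estimate the class prior from manually reviewed unlabeled examples.

    reviewed_labels: iterable of 0/1 (or bool) labels assigned by the reviewer.
    Returns the point estimate and its Wilson interval.
    """
    labels = [bool(x) for x in reviewed_labels]
    n = len(labels)
    k = sum(labels)
    return k / n, wilson_interval(k, n, z)

# Hypothetical review: 10 positives found among 200 sampled unlabeled cases.
reviewed = [1] * 10 + [0] * 190
alpha_hat, (low, high) = estimate_prior(reviewed)
print(alpha_hat, low, high)
```

The same function could be applied to a reviewed sample of the labeled positive set to check how clean it is. The width of the interval shows the cost of this approach: when positives are rare, a lot of cases have to be reviewed before the estimate of the prior becomes tight.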
- Jain, S., White, M. & Radivojac, P. Recovering True Classifier Performance in Positive-Unlabeled Learning. AAAI (2017). https://www.ccs.neu.edu/home/radivojac/papers/jain_aaai_2017.pdf
- Jain, S., White, M., Trosset, M. W. & Radivojac, P. Nonparametric semi-supervised learning of class proportions. arXiv:1601.01944 [cs, stat] (2016).