[Discussion] How to interpret probabilities?

Written by torontoai on May 6, 2019. Posted in Reddit MachineLearning.

So I’ve worked with machine learning a lot, but I’ve never had to deal directly with a classifier’s probability outputs or with what they actually mean in a real-world scenario.

So, say we have 4 balanced classes [1,2,3, and 4], and we train a Keras classifier and get the probabilities for a given prediction in the test set:

GT: Class 1

Output: [0.4, 0.3, 0.1, 0.2]

Great – the classifier predicted Class 1 correctly. But how should we interpret the 0.4 probability that was attached to the class?

Does it mean that given that particular feature vector, class 1 will be the correct choice 40% of the time?
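
One common way to make this question testable is calibration: across all test examples where the model gives its predicted class a probability of about 0.4, roughly 40% of those predictions should turn out to be correct. The sketch below is only an illustration under stated assumptions. It uses synthetic probabilities rather than a real Keras model, classes are 0-indexed, and the names `reliability_table` and `expected_calibration_error` are hypothetical helpers, not part of any library.

```python
import numpy as np

def reliability_table(probs, labels, n_bins=10):
    """Group predictions by top-class confidence and compare with accuracy."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidence = probs.max(axis=1)          # probability of the predicted class
    predicted = probs.argmax(axis=1)
    correct = (predicted == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each confidence to a bin index in 0..n_bins-1.
    bin_ids = np.digitize(confidence, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            rows.append((edges[b], edges[b + 1], int(mask.sum()),
                         float(confidence[mask].mean()),
                         float(correct[mask].mean())))
    return rows

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    rows = reliability_table(probs, labels, n_bins)
    total = sum(count for _, _, count, _, _ in rows)
    return sum(count / total * abs(conf - acc)
               for _, _, count, conf, acc in rows)

# Synthetic data (an assumption): labels are drawn from the stated
# probabilities, so these probabilities are calibrated by construction.
rng = np.random.default_rng(0)
n_samples, n_classes = 20000, 4
true_probs = rng.dirichlet(np.ones(n_classes), size=n_samples)
u = rng.random(n_samples)
labels = (true_probs.cumsum(axis=1) < u[:, None]).sum(axis=1)
labels = np.minimum(labels, n_classes - 1)  # guard against rounding in cumsum

# An overconfident variant: same ranking of classes, sharper probabilities.
overconfident = true_probs ** 3
overconfident /= overconfident.sum(axis=1, keepdims=True)

ece_calibrated = expected_calibration_error(true_probs, labels)
ece_overconfident = expected_calibration_error(overconfident, labels)

for low, high, count, conf, acc in reliability_table(true_probs, labels):
    print(f"[{low:.1f}, {high:.1f})  n={count:5d}  confidence={conf:.3f}  accuracy={acc:.3f}")
print("ECE, calibrated:", ece_calibrated)
print("ECE, overconfident:", ece_overconfident)
```

Both sets of probabilities pick the same class every time, so their accuracy is identical. Only the calibrated set has confidence values that match how often the prediction is right, which is why accuracy alone does not answer the question.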

Where does the actual performance/accuracy of the classifier come into this?

For example, say the above classifier was trained with 100k training examples, had an accuracy of 50%, and produced the probabilities [0.4, 0.3, 0.1, 0.2]. Now imagine I got more training data, trained with 500k examples, achieved an accuracy of 60%, and still got the probabilities [0.4, 0.3, 0.1, 0.2] for the same input. Does the more accurate classifier better represent the actual real-life probability of the events?

tl;dr: I guess what I’m actually asking is: how do we know/measure/compare the accuracy of the probabilities produced by a classifier?
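
One standard way to compare probability outputs, as opposed to hard predictions, is a proper scoring rule such as log loss (cross-entropy) or the Brier score, averaged over a held-out test set. Lower is better for both. The sketch below is a minimal NumPy version, assuming 0-indexed classes (so the post's Class 1 becomes index 0); `mean_log_loss` and `multiclass_brier` are hypothetical names chosen for illustration. scikit-learn ships `sklearn.metrics.log_loss` for the first of these.

```python
import numpy as np

def mean_log_loss(probs, labels, eps=1e-12):
    """Average negative log-probability assigned to the true class."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    labels = np.asarray(labels)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs)))

def multiclass_brier(probs, labels):
    """Average squared distance between the probabilities and the one-hot truth."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    one_hot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - one_hot) ** 2, axis=1)))

# The single example from the post: ground truth is Class 1 (index 0).
model_probs = np.array([[0.4, 0.3, 0.1, 0.2]])
uniform_probs = np.full((1, 4), 0.25)   # baseline: no information at all
truth = np.array([0])

model_ll = mean_log_loss(model_probs, truth)
uniform_ll = mean_log_loss(uniform_probs, truth)
model_brier = multiclass_brier(model_probs, truth)
uniform_brier = multiclass_brier(uniform_probs, truth)

print("log loss:", model_ll, "vs uniform baseline:", uniform_ll)
print("Brier:   ", model_brier, "vs uniform baseline:", uniform_brier)
```

A single example says very little on its own. In practice these scores would be averaged over the whole test set for each classifier, and the classifier with the lower score is giving the more trustworthy probabilities on that data.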

submitted by /u/Zman420