
I’m relatively new to PyTorch and deep learning. I was able to create a model and analyze a data set for both a training and a test set in a binary classification problem. Everything is working well. Now I’m setting up a program where a user can enter specific values for each of the 17 features (both continuous and categorical variables); these values are put into a tensor and passed through the model to make a prediction. This is all working fine as well.

But now I would like to add a “percent confidence” output. Currently the output is “Yes” or “No” (a print statement) depending on the argmax of the output tensor, which has two values. I would also like to report a confidence percentage for that particular output. I’m sure this is possible somehow. Can I maybe use the raw values of the output tensor to determine this level of confidence? That was just an idea I had; I’m not sure how to approach this. Thanks for the help.

1 Answer


In some sense, the raw values of the output can be interpreted as probabilities of class membership (after a softmax, so that they sum to one): 85% chance it is a dog, 10% chance it is a cat, 4% chance it is a horse, and 1% chance it is a jellyfish. However, these values may not align with the reality of event occurrence: in the above, it might be that only 55% of the time the input turns out to be a dog when the predicted probability was 85%. Indeed, many machine learning models lack calibration of their predictions, deep learning models among them.
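To get a percentage out of the two-value output tensor you describe, a softmax is the usual first step. Here is a minimal sketch; the logit values are made up for illustration, and which index means "Yes" depends on how your labels were encoded:

```python
import torch
import torch.nn.functional as F

# Hypothetical raw output (logits) from a binary classifier with two output units.
logits = torch.tensor([[1.8, -0.4]])

# softmax maps the logits to non-negative values that sum to 1.
probs = F.softmax(logits, dim=1)

# The argmax picks the class; the corresponding probability is the "confidence".
pred = torch.argmax(probs, dim=1)
confidence = probs[0, pred].item() * 100

# Assumes index 0 encodes "Yes" -- adjust to match your label encoding.
print(f"{'Yes' if pred.item() == 0 else 'No'} ({confidence:.1f}% confidence)")
```

Keep in mind that, per the caveat above, this number is the model's score, not a guaranteed real-world frequency, until you have checked calibration.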

Looking into techniques to assess and remedy prediction calibration will go a long way. Fortunately, such techniques exist in Python, starting with tools within Scikit-learn, such as isotonic regression. I find the rms::val.prob and rms::calibrate functions in R quite useful, too, and they are not exactly the same as what the Scikit-learn documentation covers, so understanding how they operate might be helpful.
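As a sketch of the Scikit-learn route, `CalibratedClassifierCV` with `method="isotonic"` wraps a base classifier and calibrates its probabilities via cross-validation. The synthetic 17-feature data and logistic-regression base model below are stand-ins for your own setup:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for the 17-feature problem.
X, y = make_classification(n_samples=2000, n_features=17, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Wrap a base classifier; isotonic regression recalibrates its probabilities
# using internal cross-validation on the training data.
calibrated = CalibratedClassifierCV(
    LogisticRegression(max_iter=1000), method="isotonic", cv=5
)
calibrated.fit(X_train, y_train)

# Calibrated class probabilities for new inputs: each row sums to 1.
probs = calibrated.predict_proba(X_test)
print(probs[:3])
```

For a deep learning model specifically, a common pattern is to calibrate the model's predicted probabilities on a held-out validation set in the same spirit, rather than wrapping the network itself.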

Dave