
I am running a model that generates song detections, each with a confidence value, and I validate it against an annotated dataset. I then plot TPR and FPR at each confidence threshold, from 0 to 1 in steps of 0.01. This is my ROC curve. The model's FPR never goes beyond 0.03. So, should I calculate the AUROC by extrapolating the extreme points to (0,0) and (1,1), or should it only cover the points that I have? This particular model shows high TPR at a low FPR, and I am not sure how to interpret it.

Figure: Model ROC; the AUC was calculated with the built-in function in R.

1 Answer


If your ROC curve doesn't pass through the points (0, 0) and (1, 1), there is an error in the way the curve is computed.

By definition, predictions whose predicted probability ("proba") is below the chosen threshold are assigned to the negative class, and predictions above the threshold are assigned to the positive class.

The proba can vary from 0 to 1, and so can the threshold. The ROC curve is the set of points with coordinates (FPR, TPR) obtained over all possible thresholds (so between 0 and 1).
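As a minimal sketch of that definition, the snippet below sweeps a grid of thresholds and records one (FPR, TPR) point per threshold. It is written in Python with NumPy rather than R, the labels and scores are made up for illustration, `roc_points` is a hypothetical helper name, and it assumes that a score equal to the threshold counts as a positive prediction.

```python
import numpy as np

def roc_points(y_true, scores, thresholds):
    """Return (FPR, TPR) arrays with one entry per threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    fpr, tpr = [], []
    for t in thresholds:
        # Assumed convention: score >= threshold means "predicted positive".
        pred = scores >= t
        tp = np.sum(pred & y_true)
        fp = np.sum(pred & ~y_true)
        fn = np.sum(~pred & y_true)
        tn = np.sum(~pred & ~y_true)
        fpr.append(fp / (fp + tn))
        tpr.append(tp / (tp + fn))
    return np.array(fpr), np.array(tpr)

# Made-up labels (1 = positive) and confidence scores, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.20, 0.40, 0.70, 0.30, 0.60, 0.80, 0.95]

# Thresholds from 0 to 1 in steps of 0.01, as in the question.
thresholds = np.linspace(0.0, 1.0, 101)
fpr, tpr = roc_points(y_true, scores, thresholds)
```

Both rates can only stay the same or decrease as the threshold increases, because raising the threshold can only turn positive predictions into negative ones.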

Imagine that you set the threshold to 0. Then no sample is predicted as negative, so FN = TN = 0, and:
FPR = FP / (FP + TN) = FP / FP = 1
TPR = TP / (TP + FN) = TP / TP = 1
So the ROC curve always passes through the point (1, 1).

Likewise, if you take a threshold of 1, the model always predicts the negative class. As there are no positive predictions, FP = TP = 0, so FPR = 0 and TPR = 0, and we can conclude that the ROC curve also always passes through the point (0, 0).
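The two endpoints can be checked numerically. The sketch below is Python with NumPy on made-up labels and scores, not the asker's R code or data. It assumes that a score equal to the threshold counts as positive and that no score is exactly 1, so a threshold of 1 yields no positive predictions. It evaluates the rates at thresholds 0 and 1, then integrates the full curve with the trapezoidal rule.

```python
import numpy as np

# Made-up labels (True = positive) and confidence scores, for illustration only.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=bool)
scores = np.array([0.05, 0.20, 0.40, 0.70, 0.30, 0.60, 0.80, 0.95])

thresholds = np.linspace(0.0, 1.0, 101)

# pred[i, j] is True when sample j is predicted positive at thresholds[i].
# Assumed convention: score >= threshold means "predicted positive".
pred = scores[None, :] >= thresholds[:, None]

tpr = (pred & y_true).sum(axis=1) / y_true.sum()
fpr = (pred & ~y_true).sum(axis=1) / (~y_true).sum()

at_threshold_0 = (fpr[0], tpr[0])    # everything predicted positive
at_threshold_1 = (fpr[-1], tpr[-1])  # nothing predicted positive

# Reverse so that FPR is increasing, then apply the trapezoidal rule.
x, y = fpr[::-1], tpr[::-1]
auc = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))
```

Here `at_threshold_0` comes out as (1, 1) and `at_threshold_1` as (0, 0), so the area is integrated over the whole FPR range from 0 to 1.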

rehaqds