A "standard" example of Bayes Theorem goes something like the following:

In any given year, 1% of the population will get disease X. A particular test will detect the disease in 90% of individuals who have the disease but has a 5% false positive rate. If you have a family history of X, your chances of getting the disease are 10% higher than they would have been otherwise.

Virtually all explanations I've seen of Bayes' Theorem will include all of those facts in their formulation of the probability. It makes perfect sense to me to account for patient-specific factors like family history, and it also makes perfect sense to me to include information on the overall reliability of the test. I'm struggling to understand the relevance of the fact that 1% of the population will get disease X, though. In particular, that fact is presumably true for all patients who receive the test; that being the case, wouldn't Bayes' Theorem imply that the actual probability of a false positive is much higher than 5% (and that one of the numbers is therefore wrong)?

Alternatively, why doesn't the 5% figure already account for that fact? Given that the 5% figure was presumably calculated directly from the data, wouldn't Bayes' Theorem effectively be contradicting the data in this case?

  • What do you mean by "the actual probability of a false positive is much higher than 5%"? The 5% figure means that if you take 100 people who actually do not have the disease and give them all the test, on average 5 of them will test positive. This does not have anything to do with how prevalent the disease is in the population, because here you are only considering those people who truly do not have the disease. – user856 Oct 16 '18 at 16:54
  • @Rahul The number of false positives that was inferred from the data is 5%, but if we combine that with the fact that a very small percent of the population gets the disease in the first place, Bayes' Theorem predicts that we should be a lot less certain that a patient has the disease than the test would seem to indicate. Wouldn't that predict that the 5% uncertainty that the data seems to indicate is too low? After all, saying "the test says that the patient has disease X, but I don't believe that they do" is exactly equivalent to saying that the test was a false positive, right? – EJoshuaS - Stand with Ukraine Oct 16 '18 at 16:59
  • "The number of false positives that was inferred from the data is 5%" That's where you're wrong. In this sort of problem you know for a fact that the false positive rate (as defined in my comment) really is 5%. If it helps, imagine that the false positive rate was determined by comparing the test in question to a perfectly accurate but extremely expensive test which is infeasible to use in medical practice. –  Oct 16 '18 at 17:07
  • @Rahul I think I see that now - you bring up a good point. – EJoshuaS - Stand with Ukraine Oct 16 '18 at 17:08
  • Related: https://math.stackexchange.com/questions/32933/describing-bayesian-probability – Henry Oct 08 '20 at 10:10

2 Answers

I believe it's commonly included because the result is counterintuitive. You would expect a test with a high degree of accuracy to be right most of the time, but when the disease is rare this isn't actually the case, and stronger evidence is required. To address this, I think of it as the "error of one sample" fallacy: you can't run an experiment one time and draw strong conclusions from it, even if the experiment is well-designed.
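To make this concrete, here is the question's own example worked through (setting aside, for simplicity, the 10% family-history adjustment, and writing $D$ for diseased and $H$ for healthy): with 1% prevalence, 90% sensitivity, and a 5% false positive rate, Bayes' Theorem gives

$$P(D\mid +)=\frac{P(+\mid D)\,P(D)}{P(+\mid D)\,P(D)+P(+\mid H)\,P(H)}=\frac{0.9\times 0.01}{0.9\times 0.01+0.05\times 0.99}\approx 15.4\%,$$

so even among patients who test positive, roughly 85% are actually healthy: the 1% prior overwhelms the test's accuracy.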

CyclotomicField
  • So, the fact that it tends to be reliable for the population as a whole doesn't necessarily tell you whether the test worked for this particular patient (because they could be in the 5% of people that were a false positive, or in the 10% of people that were a false negative)? – EJoshuaS - Stand with Ukraine Oct 16 '18 at 16:48
  • @EJoshuaS Exactly. Statistics only apply to group data, not to individual data points. In this case you're trying to figure out where an individual data point lies in the sample space, so when we look at an individual we need to gather evidence about that individual to correctly classify them. The evidence that only 1% of the sample space contains individuals who have this ailment should create a great deal of skepticism when claiming any individual lies in that subspace. – CyclotomicField Oct 16 '18 at 16:58
  • That makes sense - so, the key is to look at the difference between reasoning about groups and reasoning about individuals. – EJoshuaS - Stand with Ukraine Oct 16 '18 at 17:00
  • This has nothing to do with the fact that the solution is counterintuitive. The prevalence rate (here 1%) must be included in the problem; otherwise an answer is impossible. To see this, just replace 1% prevalence with 99% prevalence. The probability of somebody having the disease if (s)he tested positive would surely change, wouldn't it? So if you didn't know the prevalence rate, how would you arrive at an answer? – Hans Engler Oct 16 '18 at 17:05
  • @HansEngler My claim that it's counterintuitive comes from the fact that when doctors have been polled in the past, almost all of them got this wrong even when given all the relevant information. I'm confident these studies have motivated some authors to include this example in their textbooks. – CyclotomicField Jul 07 '22 at 17:01

Further to user856's explanation in the comments, here's a complementary answer.

The way to frame/interpret medical tests in general is to understand them as updating one's level of certainty that the patient has the disease:

  • without a medical-test result, the disease prevalence (a measure of disease frequency) can be taken as the patient's probability of having the disease;
  • however, in the context of a medical-test result, the aforementioned probability has changed: its updated value depends not just on the disease prevalence (as before), but now also on the test's sensitivity (true positive rate) and specificity (true negative rate). In other words, our knowledge of said probability has been refined.

[Tree diagram of the four test outcomes: https://i.sstatic.net/ZPmMO.png]

p:  disease prevalence and other (prior) risk factors
v:  test sensitivity
f:  test specificity
D:  Diseased
H:  Healthy
+:  Positive test result
-:  Negative test result

The abovementioned probabilities are

  1. the positive predictive value, i.e., the probability that the patient is indeed Diseased given a positive test result $$P(D|+)=\frac{P(D+)}{P(D+)+P(H+)}=\frac{pv}{pv+(1-p)(1-f)},$$
  2. the false omission rate, i.e., the probability that the patient is actually Diseased given a negative test result $$P(D|-)=\frac{P(D-)}{P(D-)+P(H-)}=\frac{p(1-v)}{p(1-v)+(1-p)f}.$$

Thus, a screening test's predictive values $P(D|+)\,$ & $\,P(H|-)$ and overall accuracy $$P(D+)+P(H-)=pv+(1-p)f$$ depend on both its technical characteristics (sensitivity and specificity) and the population that it is being used on (disease prevalence). In particular:

  • unless the test has 100% sensitivity, its number of false-negative results is proportional to the disease prevalence $p;$
  • unless the test has 100% specificity, its number of false-positive results is proportional to $(1-p).$
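As a quick numerical check, here is a minimal Python sketch of the formulas above (the function test_metrics and its variable names are mine, purely for illustration), applied to the question's example numbers:

    def test_metrics(p, v, f):
        """Return (PPV, FOR, accuracy) for a test with prevalence p,
        sensitivity v, and specificity f, per the formulas above."""
        ppv = p * v / (p * v + (1 - p) * (1 - f))                   # P(D|+)
        false_omission = p * (1 - v) / (p * (1 - v) + (1 - p) * f)  # P(D|-)
        accuracy = p * v + (1 - p) * f                              # P(D+) + P(H-)
        return ppv, false_omission, accuracy

    # The question's example: 1% prevalence, 90% sensitivity, 95% specificity.
    ppv, fomr, acc = test_metrics(p=0.01, v=0.90, f=0.95)
    print(f"PPV = {ppv:.2%}, FOR = {fomr:.2%}, accuracy = {acc:.2%}")
    # PPV = 15.38%, FOR = 0.11%, accuracy = 94.95%

Despite the test's 90% sensitivity and 95% specificity, a positive result raises the patient's probability of disease only from 1% to about 15%, which is exactly the effect of the prevalence term.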

N.B. The OP mentions “test reliability”, but that's a separate issue: reliability typically refers to the consistency of a test's results across repeated administrations.

Finally, here is a concrete extended example (based on actual data, and simplistically assuming that successive tests are independent of one another) to put all this in context:

[Image: worked example of repeated COVID-19 PCR and rapid testing at 0.25% prevalence]

Due to the low disease prevalence,

  • the PCR and rapid tests have positive predictive values of only $4\%$ and $17\%$ respectively,
  • whereas their negative predictive values are both almost $100\%;$

and the tests' overall accuracies are $95\%$ and $99\%$ respectively.
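To see the prevalence effect numerically, the test_metrics sketch from above can be reused. The 93% sensitivity and 95% specificity below are assumed values taken from the comment underneath; the exact figures behind the worked example may differ slightly:

    # One PCR-like test at two prevalences: 0.25% versus 6% (see the comment below).
    # Sensitivity 93% and specificity 95% are assumptions, not confirmed figures.
    for p in (0.0025, 0.06):
        ppv, fomr, acc = test_metrics(p=p, v=0.93, f=0.95)
        print(f"prevalence = {p:.2%}: PPV = {ppv:.0%}, accuracy = {acc:.0%}")
    # prevalence = 0.25%: PPV = 4%, accuracy = 95%
    # prevalence = 6.00%: PPV = 54%, accuracy = 95%

Moving from 0.25% to 6% prevalence lifts the positive predictive value from about 4% to over 50% without changing the test itself.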

ryang
  • This is a really good point. So basically, it wouldn't be a true "update" of our belief if we didn't account for the evidence that we already knew about before running the test? – EJoshuaS - Stand with Ukraine Jan 04 '21 at 17:43
  • Exactly. While everyone remembers that medical screening tests aren't definitive/"gold standard" (sensitivity = 100% = specificity), the common fallacy is implicitly forgetting that disease prevalence isn't 100% either! If the disease prevalence in the above COVID-19 example were 6% instead of 0.25% (a change of context/geographic location), then the positive predictive value $P(D|+)$ of a single PCR test would correspondingly jump from 4% (not very conclusive at all) to over 50%; but even this 50% PPV is a far cry from the test's 93%-95% sensitivity and specificity! – ryang Dec 09 '21 at 07:43