Why was the word “confidence” introduced into statistical practice? What does it convey that “probability” does not?

Question

Edit: Hagen von Eitzen also asks about the word "likelihood".

I guess the short answer is that, when we compute a confidence interval, then the actual mean either lies in the interval or it doesn't; there's no probability involved. This is expanded on a bit in the accepted answer to this question. — Mark McClure, Apr 25 '18 at 10:18
In a normal distribution we are naturally confident to within $\pm 1\sigma$ limits, for example. If the occurrence is way outside $3\sigma$, we lose confidence in getting or capturing the thing of interest the way we expected. — Narasimham, Apr 25 '18 at 18:14
If this question is to be closed, it should be by migrating it to stats.stackexchange.com . The question that it poses is important. — Michael Hardy, Apr 26 '18 at 01:46

score 1 · Answer 1 · edited Jan 22 '20 at 13:51

A confidence interval differs from a probability interval even though there is a confidence distribution that is in fact a probability distribution.

If you say $A$ is between $X$ and $Y$ with $90\%$ confidence, what does it mean?

Say you take a sample of $50$ observations and you use the data to compute two statistics $X$ and $Y$.

Then you independently take a sample of another $50$ observations and use the data to compute those two statistics $X$ and $Y.$

And so on.

And you can prove that $90\%$ of the time, $A$ is between $X$ and $Y.$ Then the interval from $X$ to $Y$ is a $90\%$ confidence interval for $A.$
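The repeated-sampling reading above can be checked by simulation. The following is only a sketch under assumptions that are not in the answer: the observations are normal with known standard deviation, $A$ is their mean, and $X$ and $Y$ are the usual $z$-interval endpoints. The names `true_mean`, `sigma` and `repeats` are made up for illustration.

```python
import random
from statistics import NormalDist, fmean

def coverage(true_mean=10.0, sigma=2.0, n=50, level=0.90, repeats=10_000, seed=1):
    """Fraction of repeated samples whose interval [X, Y] contains A = true_mean."""
    rng = random.Random(seed)
    # Two-sided normal critical value, e.g. about 1.645 for a 90% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repeats):
        # Draw a fresh sample of n observations and recompute X and Y from it.
        xbar = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        x, y = xbar - half_width, xbar + half_width
        hits += x <= true_mean <= y
    return hits / repeats

print(coverage())  # should come out close to 0.90
```

The $90\%$ here is a property of the procedure over many samples. Nothing in the simulation assigns a probability to any single computed interval.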

But let us ask whether, given one particular instance where you've computed $X$ and $Y,$ you can validly claim there is a $90\%$ probability that $A$ is between $X$ and $Y.$ Several things can go wrong with that.

One case is when you actually somehow know the value of $A.$ If for one sample of $50$ observations, you get $X<Y<A,$ then the conditional probability that $A$ is between $X$ and $Y,$ given what you know, is $0.$
Another case is when you have a prior probability distribution of $A.$ Then you can find the conditional probability distribution of $A$ given your data, but if $X$ and $Y$ happen to fall in a region where $A$ is unlikely to be, then that $90\%$ confidence interval may be very different from the $90\%$ probability interval, which, in this case, you could compute.
Another thing that can go wrong is that the data themselves tell you that the particular interval you've got is likely to be one of the $10\%$ of cases in which $A$ is not between $X$ and $Y.$ Here's an example: Suppose $X_1,X_2$ are independent random variables that are uniformly distributed between $A-1/2$ and $A+1/2.$ Then the interval from $\min\{X_1,X_2\}$ to $\max\{X_1,X_2\}$ is a $50\%$ confidence interval for $A.$ But if you have a case in which $X_1,X_2$ differ from each other by $0.000001,$ then it is highly improbable that $A$ is between them, and if they differ by $0.999999$ then it is nearly certain that $A$ is between them.
A more extreme case is when the data imply that $A$ cannot possibly be between $X$ and $Y.$ An example is when $X_1,X_2,X_3$ are independent and uniformly distributed between $0$ and $A>0,$ and $\overline X = (X_1+X_2+X_3)/3$ and you use suitable multiples of $\overline X,$ say $a\overline X$ and $b\overline X,$ as endpoints of a $90\%$ confidence interval. Given all of that, it's not hard to find those numbers $a$ and $b.$ But now suppose we observe $X_1=1, X_2=2, X_3=999.$ Then $\overline X = 334$ and the bounds are $X=a\overline X$ and $Y=b\overline X,$ and if I'm not mistaken (I'll have to work out the mathematical details) the two bounds of the confidence interval are both less than $999,$ although clearly it is impossible for $A$ to be less than $999.$
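The two uniform examples can be explored numerically. This is a sketch, not part of the original answer. The first function simulates the two-observation example and reports coverage overall and separately for small and large gaps (the cut-offs $0.1$ and $0.5$ are arbitrary choices). The second works out the bounds for the three-observation example, assuming the equal-tailed choice of $a$ and $b$, which the answer does not specify.

```python
import random

def coverage_by_gap(a=3.0, trials=100_000, seed=0):
    """Coverage of [min, max] for two U(a-1/2, a+1/2) draws, overall and by gap."""
    rng = random.Random(seed)
    tallies = {"all": [0, 0], "gap < 0.1": [0, 0], "gap > 0.5": [0, 0]}
    for _ in range(trials):
        x1 = rng.uniform(a - 0.5, a + 0.5)
        x2 = rng.uniform(a - 0.5, a + 0.5)
        lo, hi = min(x1, x2), max(x1, x2)
        gap = hi - lo
        groups = ["all"]
        if gap < 0.1:
            groups.append("gap < 0.1")
        elif gap > 0.5:
            groups.append("gap > 0.5")
        for g in groups:
            tallies[g][0] += lo <= a <= hi  # 1 if the interval covers a
            tallies[g][1] += 1
    return {g: hits / n for g, (hits, n) in tallies.items()}

def mean_interval_three_uniforms(xs):
    """Equal-tailed 90% interval for A from the mean of three U(0, A) draws."""
    # The sum of three U(0,1) variables has CDF s**3 / 6 on 0 <= s <= 1,
    # so its 5% quantile solves s**3 / 6 = 0.05; dividing by 3 gives the mean.
    q_lo = (6 * 0.05) ** (1 / 3) / 3
    q_hi = 1 - q_lo  # the distribution of the mean is symmetric about 1/2
    xbar = sum(xs) / 3
    return xbar / q_hi, xbar / q_lo  # (a * xbar, b * xbar)

print(coverage_by_gap())
print(mean_interval_three_uniforms([1, 2, 999]))
```

In the first example the overall coverage is about $50\%$, while intervals with a small gap rarely contain $A$ and intervals with a gap above $1/2$ always do. For the second example with the equal-tailed choice, the bounds come out at roughly $430$ and $1497$. The lower part of the interval, below $999$, is ruled out by the data, but the upper bound is not below $999$ under this choice of $a$ and $b$.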

(Actually in that last case a better way to find a confidence interval is to use $a\max\{X_1,X_2,X_3\}$ and $b\max\{X_1,X_2,X_3\}$ as the bounds of the confidence interval, with values of $a$ and $b$ differing from the ones used with $\overline X.$ But the interval described is still a valid confidence interval.) — Michael Hardy, Apr 26 '18 at 01:52
@a.anand : In Bayesian inference one does not speak of confidence intervals, but of credible intervals or probability intervals. And the rest of your comment is mistaken as well. — Michael Hardy, Apr 26 '18 at 17:18
