
Edit: Hagen von Eitzen also asks about the word "likelyhood".

1 Answer


A confidence interval differs from a probability interval even though there is a confidence distribution that is in fact a probability distribution.

If you say $A$ is between $X$ and $Y$ with $90\%$ confidence, what does it mean?

Say you take a sample of $50$ observations and you use the data to compute two statistics $X$ and $Y$.

Then you independently take a sample of another $50$ observations and use the data to compute those two statistics $X$ and $Y.$

Then you independently take a sample of another $50$ observations and use the data to compute those two statistics $X$ and $Y.$

Then you independently take a sample of another $50$ observations and use the data to compute those two statistics $X$ and $Y.$

And so on.

Suppose you can prove that in $90\%$ of such samples, $A$ is between $X$ and $Y.$ Then the interval from $X$ to $Y$ is a $90\%$ confidence interval for $A.$
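The repeated-sampling reading above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the answer: the observations are taken to be normal with a known standard deviation, $A$ is their mean, and $X$ and $Y$ are the usual sample mean plus or minus a normal quantile times the standard error. The function name `coverage` and all parameter values are made up for the illustration.

```python
import random
from statistics import NormalDist, fmean


def coverage(a=10.0, sigma=2.0, n=50, reps=20_000, level=0.90, seed=0):
    """Fraction of repeated samples whose interval [X, Y] contains a.

    Assumption for this sketch: observations are Normal(a, sigma) with
    sigma known, so X, Y = sample mean -/+ z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # One fresh, independent sample of n observations.
        sample = [rng.gauss(a, sigma) for _ in range(n)]
        centre = fmean(sample)
        x, y = centre - half_width, centre + half_width
        hits += x <= a <= y  # did this particular interval catch a?
    return hits / reps


if __name__ == "__main__":
    print(coverage())
```

The returned fraction should land close to $0.90.$ Note that it describes the procedure over many samples; it says nothing yet about any single computed pair $X, Y.$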

But let us ask whether, given one particular instance where you've computed $X$ and $Y,$ you can validly claim there is a $90\%$ probability that $A$ is between $X$ and $Y.$ Several things can go wrong with that.

  • One case is when you actually somehow know the value of $A.$ If for one sample of $50$ observations, you get $X<Y<A,$ then the conditional probability that $A$ is between $X$ and $Y,$ given what you know, is $0.$
  • Another case is when you have a prior probability distribution of $A.$ Then you can find the conditional probability distribution of $A$ given your data, but if $X$ and $Y$ happen to fall in a region where $A$ is unlikely to be, then that $90\%$ confidence interval may be very different from the $90\%$ probability interval, which, in this case, you could compute.
  • Another thing that can go wrong is that the data themselves tell you that the particular interval you've got is likely to be one of the $10\%$ of cases in which $A$ is not between $X$ and $Y.$ Here's an example: Suppose $X_1,X_2$ are independent random variables that are uniformly distributed between $A-1/2$ and $A+1/2.$ Then the interval from $\min\{X_1,X_2\}$ to $\max\{X_1,X_2\}$ is a $50\%$ confidence interval for $A.$ But if you have a case in which $X_1,X_2$ differ from each other by $0.000001,$ then it is highly improbable that $A$ is between them, and if they differ by $0.999999$ then it is nearly certain that $A$ is between them.
  • A more extreme case is when the data imply that $A$ cannot possibly be between $X$ and $Y.$ An example is when $X_1,X_2,X_3$ are independent and uniformly distributed between $0$ and $A>0,$ and $\overline X = (X_1+X_2+X_3)/3$ and you use suitable multiples of $\overline X,$ say $a\overline X$ and $b\overline X,$ as endpoints of a $90\%$ confidence interval. Given all of that, it's not hard to find those numbers $a$ and $b.$ But now suppose we observe $X_1=1, X_2=2, X_3=999.$ Then $\overline X = 334$ and the bounds are $X=a\overline X$ and $Y=b\overline X.$ Both bounds are less than $999$ exactly when $b<999/334\approx 2.99.$ Working out the details: at the $90\%$ level that cannot happen, since $90\%$ coverage forces $b\ge 3.56$ (the equal-tailed interval has $a\approx 1.29$ and $b\approx 4.48$), so only the part of the interval below $999$ is ruled out by the data. At a lower level it does happen: the equal-tailed $50\%$ interval has $a\approx 1.62$ and $b\approx 2.62,$ so it runs from about $541$ to about $874,$ and both bounds are less than $999,$ although clearly it is impossible for $A$ to be less than $999.$
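The two-observation uniform example in the list above can be checked numerically. The sketch below simulates pairs $X_1,X_2$ uniform on $(A-1/2,\,A+1/2)$ and reports how often $A$ lies between them, overall and separately for pairs that are very close together or very far apart. The function name `coverage_by_gap`, the gap thresholds $0.1$ and $0.9$, and the choice $A=0$ are assumptions made for the illustration.

```python
import random


def coverage_by_gap(a=0.0, reps=200_000, seed=0):
    """Coverage of [min(X1, X2), max(X1, X2)] for a, split by gap size."""
    rng = random.Random(seed)
    # Each entry holds [number of intervals containing a, number of intervals].
    counts = {"all": [0, 0], "gap < 0.1": [0, 0], "gap > 0.9": [0, 0]}
    for _ in range(reps):
        x1 = rng.uniform(a - 0.5, a + 0.5)
        x2 = rng.uniform(a - 0.5, a + 0.5)
        lo, hi = min(x1, x2), max(x1, x2)
        covered = lo <= a <= hi
        gap = hi - lo
        keys = ["all"]
        if gap < 0.1:
            keys.append("gap < 0.1")
        elif gap > 0.9:
            keys.append("gap > 0.9")
        for key in keys:
            counts[key][0] += covered
            counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


if __name__ == "__main__":
    for label, rate in coverage_by_gap().items():
        print(f"{label}: {rate:.3f}")
```

The overall rate should be near $0.5,$ matching the stated $50\%$ confidence level, while the rate for small gaps is far lower and the rate for large gaps is $1.$ This agrees with a direct calculation: given a gap $d\le 1/2,$ the conditional probability that $A$ is between the two observations is $d/(1-d),$ and for $d>1/2$ it is $1.$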
Mew
  • Actually in that last case a better way to find a confidence interval is to use $a\max\{X_1,X_2,X_3\}$ and $b\max\{X_1,X_2,X_3\}$ as the bounds of the confidence interval, with values of $a$ and $b$ differing from the ones used with $\overline X.$ But the interval described is still a valid confidence interval. – Michael Hardy Apr 26 '18 at 01:52
  • @a.anand : In Bayesian inference one does not speak of confidence intervals, but of credible intervals or probability intervals. And the rest of your comment is mistaken as well. – Michael Hardy Apr 26 '18 at 17:18