According to frequentists, why can't probabilistic statements be made about population parameters?

Question

I recently read Wikipedia's entry on Confidence Intervals:

https://en.wikipedia.org/wiki/Confidence_interval

There are a few statements that I have trouble understanding:

The confidence interval can be expressed in terms of samples (or repeated samples): "Were this procedure to be repeated on multiple samples, the calculated confidence interval (which would differ for each sample) would encompass the true population parameter 90% of the time."[1] Note that this does not refer to repeated measurement of the same sample, but repeated sampling.

And:

The confidence interval can be expressed in terms of a single sample: "There is a 90% probability that the calculated confidence interval from some future experiment encompasses the true value of the population parameter." Note this is a probability statement about the confidence interval, not the population parameter.

And:

A 95% confidence interval does not mean that for a given realised interval calculated from sample data there is a 95% probability the population parameter lies within the interval, nor that there is a 95% probability that the interval covers the population parameter.[11] Once an experiment is done and an interval calculated, this interval either covers the parameter value or it does not; it is no longer a matter of probability.

I realize that according to frequentist methods, the value of the parameter is, as the quotes above indicate, a fixed value and not a random variable. It is the sample data that is random.

I also know that often in the "real world", systems that are deterministic can be modeled as probabilistic if there is not enough information about the system itself. An example of this would be the tossing of a coin. How the coin lands is a deterministic process. It is only because there are so many variables that we model it as a probabilistic system. In other words, in this case, we "paper over" our ignorance of the details of the system with a probabilistic model.

I realize that in a population with an unknown parameter, according to frequentist methods, this value is not a random variable. Yet, we also know that if we calculate a confidence interval in a certain way, we produce an interval that encompasses the true population parameter 90% of the time. So then why can't we paper over our ignorance of the value and say "There is a 90% probability the population parameter lies within this interval"? After all, as we just said, 90% of the time we calculate a confidence interval, the true population parameter will lie within the interval.

So even though the population parameter is not a random variable, since we don't know what it is, why can't we make a probabilistic statement about it?

score 0 · Answer 1 · answered Sep 05 '16 at 16:03

Suppose that you want to model the random behaviour of a certain population. Then you have to associate with the population a density function $f$ (that is, you choose a "normal distribution", "exponential distribution", etc.) and a parameter $\theta$ (for example, if your density is normal, then $\theta$ can be the population mean, the variance, etc.).

Suppose that you have decided which $f$ you want, that is, the distribution for your population. The goal now is to estimate $\theta$. In frequentist statistics, $\theta$ is an unknown constant to be discovered. That is why we speak about confidence and not about probability.

Example: imagine I want to model the height of people in England. I associate the normal distribution with it, so $f$ is the density function of a normal. Now I want to estimate $\mu=\text{population mean}$. One takes a sample $X_1,\ldots,X_n$ of heights and uses the fact that $$ \frac{\bar{X_n}-\mu}{S_n/\sqrt{n}}\sim t_{n-1}, $$ where $S_n$ is the sample standard deviation. One computes $a,b>0$ so that $$ P\left(-b<\frac{\bar{X_n}-\mu}{S_n/\sqrt{n}}<a\right)=0.95, $$ that is, $$ P\left(\bar{X_n}-a\cdot S_n/\sqrt{n}<\mu<\bar{X_n}+b\cdot S_n/\sqrt{n}\right)=0.95. $$ Here it makes sense to speak about probability because $\bar{X_n}$ and $S_n$ are random variables. Now, what you do is replace $\bar{X_n}$ and $S_n$ (random variables) with the observed sample mean $\bar{x_n}$ and sample standard deviation $s_n$ (constant values), and your confidence interval is $$ I=[\bar{x_n}-a\cdot s_n/\sqrt{n},\bar{x_n}+b\cdot s_n/\sqrt{n}]. $$ The parameter $\mu$ is a constant, so either it belongs to $I$ or it does not (there is no probability here). But you have a lot of confidence that it belongs to $I$.
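The repeated-sampling reading of this example can be checked with a small simulation. The following is only a sketch under stated assumptions: the population values (`TRUE_MU`, `TRUE_SIGMA`), the sample size, and the helper names are made up for illustration, and the critical value is hardcoded from a $t$ table for the symmetric case (equal cut-offs on both sides) rather than computed.

```python
import math
import random
import statistics

# Assumed values for illustration only (not taken from the answer).
TRUE_MU = 170.0      # hypothetical population mean height, in cm
TRUE_SIGMA = 10.0    # hypothetical population standard deviation
N = 10               # sample size
T_CRIT = 2.262157    # 0.975 quantile of Student's t with N - 1 = 9 d.o.f. (table value)


def t_interval(sample):
    """Return the symmetric 95% t-interval (lower, upper) for the mean."""
    mean = statistics.fmean(sample)
    half_width = T_CRIT * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half_width, mean + half_width


def coverage(trials, seed=0):
    """Fraction of simulated intervals that contain TRUE_MU."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(TRUE_MU, TRUE_SIGMA) for _ in range(N)]
        lower, upper = t_interval(sample)
        # Each realized interval is a plain hit (True) or miss (False).
        hits += lower < TRUE_MU < upper
    return hits / trials


if __name__ == "__main__":
    print(coverage(20_000))  # should come out close to 0.95
```

The 0.95 shows up only as a long-run fraction over many samples; inside the loop, every individual interval either contains `TRUE_MU` or does not.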

Remark: in contrast to frequentist statistics, one may use Bayesian statistics, which treats the parameter $\theta$ as a random variable with a probability distribution to be discovered. In this case one speaks about credible regions (probabilities) and not confidence intervals (confidence).

Now let's say that someone did this and obtained a 90% confidence interval. I now make a wager with you as to whether the true value is in the interval (let's say that the true value can be looked up somewhere, and can be verified). But we'll play it a little differently: you get to set the payout odds, but I get to decide which side of the bet to take. E.g.: If you set the payout at 100:1 then I will decide to take the side that bets that the true value of the parameter is in the interval. If I win, I receive 100, else I lose 1. So what odds would you choose? And what does that imply? – Israel Sep 05 '16 at 16:20
You have the confidence that $90\%$ of the time you will receive 100 and $10\%$ of the time you will lose $1$. You do not speak about probability, because the parameter is a constant, which either lies or does not lie in the interval. You have confidence that the parameter will be in the interval. You would speak about probability if the parameter were a random variable. – Sep 05 '16 at 16:29
I'm not a statistician, but I think that if you read about Bayesian statistics, it would be clearer :) – Sep 05 '16 at 16:30
If 90% of the time I win 100 and 10% I lose 1 then my expected winnings are: 90% * 100 + 10% * -1 = 89.9. In this formula, 90% represents the probability that I win the bet. But if winning the bet is contingent on the parameter being in the interval, then doesn't it follow that there is a 90% chance the parameter is in the interval? – Israel Sep 05 '16 at 16:37
But "you have the confidence that $90\%$ of the time you will receive 100" $\neq$ "the probability that you receive 100 is $0.9$". In the first case, whether you win or not is fixed in advance by whether the constant parameter is in the interval or not, so there is no randomness, just confidence. In the second case (the one you describe in your previous comment), whether you win behaves as a random variable and therefore there is randomness in your parameter. – Sep 05 '16 at 16:59
I see what you're saying, but it just seems to me that the difference between confidence and probability is only a semantic distinction. If I roll a die and cover it up before looking at it, then you could argue as you did above, that there is no randomness, either the 6 came up or not. However, aren't I fully justified in saying that the probability that the 6 came up is 1 in 6? – Israel Sep 05 '16 at 17:09
The distinction is not semantic. In the case of a die, you can define $X$ as the random variable that gives the number after rolling the die. Then it makes sense to write $$P(\underbrace{X}_{\text{random var}}=6).$$ But does it make sense for you to write $$P(\underbrace{\bar{x_n}-a\cdot s_n/\sqrt{n}}_{\text{constant}}<\underbrace{\theta}_{\text{constant}}<\underbrace{\bar{x_n}+b\cdot s_n/\sqrt{n}}_{\text{constant}})=0.9\,?$$ – Sep 05 '16 at 17:33
The idea is that, although the process of obtaining the sample and $\bar{x_n}$ is random, you consider the sample fixed and constant, and therefore $\bar{x_n}$ constant. I had the same questions as you when I started learning about frequentist statistics, and the questions that arise are completely normal because frequentist statistics entails "contradictions". Because of this, Bayesian statistics appeared. In the Bayesian setting the parameters are treated as random variables and you work with "probability" (an object well-defined mathematically), and not with "confidence" and other vague words. – Sep 05 '16 at 18:58
Okay, I can see why according to frequentist methods one can't speak about the parameter being random. But the sample data that we obtain can be viewed as random. And if the data is random, then the interval is random. And if this is the case, if we can't say "there is a 90% chance that the parameter lies within the interval", can we say "there is a 90% chance that the interval contains the parameter"? After all, we do know that 9 out of 10 times it will. – Israel Sep 06 '16 at 14:34
@Israel. The parameter $\theta$ is constant. Take a sample and construct the confidence interval $I_1$. Take another sample and construct $I_2$. Continue until $I_{100}$. Then you have the confidence that $\theta$ will be in $90$ out of those $100$ intervals. I do not see why that is the same as saying that the probability that $\theta$ is in $I_1$ is $0.9$. You have that $\theta$ either lies or does not lie in $I_1$, because $\theta$ is constant. But, intuitively, you have a lot of confidence that $\theta$ will be in $I_1$. – Sep 06 '16 at 15:52
In your previous comment, if I replace "interval" with "egg" and θ with "prize", then I can rewrite your comment as: "You have the confidence that the prize will be in 90 out of those 100 eggs. I do not see why that is the same as saying that the probability that the prize is in this given egg is 0.9. You have that the prize either lies or does not lie within this given egg." If I write it this way, then your comment doesn't make sense. I have a hard time seeing why the interval and egg problems aren't equivalent. You've randomly selected something which has a 0.9 chance of "success". – Israel Sep 06 '16 at 16:47
In either the interval or the egg problem, you would select 1 to 9 payout odds if there is a "success", so in my book they are functionally equivalent. Yes, I realize that mathematically we have defined the parameter not as a random variable. Yet I can't help but return to the fact that "informally" you can treat both problems identically. If a frequentist would select 1 to 9 payout odds (and who wouldn't), then they are treating the probability of finding the parameter in the interval as 0.9 whether they like it or not. – Israel Sep 06 '16 at 16:55
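The 1 to 9 odds in the wager can be tried out in a simulation. This is a rough sketch, not part of the original exchange: the function name, the standard normal population, the sample size of 10, and the hardcoded $t$ critical value are all assumptions chosen for illustration. It checks only the long-run behaviour over repeated samples and says nothing by itself about a single realized interval.

```python
import math
import random
import statistics

T_CRIT_90 = 1.833113  # 0.95 quantile of Student's t with 9 d.o.f. (table value)


def average_payoff(trials, mu=0.0, sigma=1.0, n=10, seed=1):
    """Average payoff per bet on 'the 90% interval covers mu' at 1-to-9 odds."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw a fresh sample and build a symmetric 90% t-interval from it.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        half = T_CRIT_90 * statistics.stdev(sample) / math.sqrt(n)
        covered = mean - half < mu < mean + half
        total += 1.0 if covered else -9.0  # win 1 on a cover, lose 9 on a miss
    return total / trials


if __name__ == "__main__":
    print(average_payoff(20_000))  # should be near 0, i.e. the odds are roughly fair
```

An average payoff near zero reflects that about 90% of the constructed intervals cover `mu`, which is the repeated-sampling statement both sides of the thread accept.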
@Israel. But when I say "the confidence that $\theta$ will be in 90 out of those 100 intervals", that is something intuitive, not something defined mathematically. When you write $$P(\bar{X_n}-a\cdot S_n/\sqrt{n}<\mu<\bar{X_n}+b\cdot S_n/\sqrt{n})=0.9,$$ that is mathematically completely correct, but the step of replacing $\bar{X_n}$ with $\bar{x_n}$ is not justified; it is just intuition, and you express that fact with the word "confidence" and not with "probability", because probability has to do with random variables, and $\mu$ is not one. – Sep 06 '16 at 18:41
Maybe you should ask this to other people. I don't know how to explain this better (or if I understand this properly :) ). – Sep 06 '16 at 18:43
@Israel. I think I finally understand. I think you can say $P(\text{a constructed interval contains }\mu)=0.9$, but that is not the same as saying, for a fixed confidence interval $I$, that $P(\mu\in I)=0.9$. – Sep 06 '16 at 21:21
What you say makes sense to me, but how do you reconcile that with the following statement from Wikipedia's article on Confidence Intervals: "A 95% confidence interval does not mean that for a given realised interval calculated from sample data . . . there is a 95% probability that the interval covers the population parameter. Once an experiment is done and an interval calculated, this interval either covers the parameter value or it does not; it is no longer a matter of probability." – Israel Sep 07 '16 at 16:53
@Israel. What Wikipedia says is precisely what I defend. For me, it makes sense to write $P(\text{a constructed interval contains }\mu)=0.9$, but it does not make sense, for a fixed confidence interval $I$, to write $P(\mu\in I)=0.9$. Example: imagine that $\mu=2$, and you compute three confidence intervals: $I_1=[1,3]$, $I_2=[1.5,3.5]$ and $I_3=[3,5]$. It is correct to say that $P(\text{a constructed interval contains }\mu)=2/3$ (Laplace's rule), but it does not make sense to write $P(2\in I_1)=2/3$ or $P(2\in I_3)=2/3$, because either $2$ belongs to $I_j$ or it does not (it is no longer a matter of probability). – Sep 07 '16 at 19:43
I think I used the ellipsis poorly. Let me requote it: "A 95% confidence interval does not mean that for a given realised interval calculated from sample data there is a 95% probability the population parameter lies within the interval, nor that there is a 95% probability that the interval covers the population parameter. Once an experiment is done and an interval calculated, this interval either covers the parameter value or it does not; it is no longer a matter of probability." From this, it would seem that you can't say $P(\text{a constructed interval contains }\mu)=0.9$. – Israel Sep 07 '16 at 19:56
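The three-interval example can be spelled out in a few lines. This is a minimal sketch with a hypothetical helper `covers`; the numbers are the ones from the comment, and the intervals are treated as closed.

```python
MU = 2.0
intervals = [(1.0, 3.0), (1.5, 3.5), (3.0, 5.0)]  # I_1, I_2, I_3


def covers(interval, mu):
    """True if the closed interval [lower, upper] contains mu."""
    lower, upper = interval
    return lower <= mu <= upper


# For each fixed interval the answer is simply True or False.
hits = [covers(iv, MU) for iv in intervals]

# The 2/3 appears only as a fraction over the collection of intervals.
fraction = sum(hits) / len(hits)

print(hits)      # [True, True, False]
print(fraction)
```

Here `hits` holds one definite answer per interval, while `fraction` describes the collection as a whole.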
@Israel. I don't really know. I think that saying $P(\text{a constructed interval contains }\mu)=0.9$ is correct, and saying that, for a fixed $I$, $P(\mu\in I)=0.9$ is not. In Wikipedia it is stated: "If the polling were repeated a large number of times (you could produce a 95% confidence interval for your polling confidence interval), each time generating about a 95% confidence interval from the poll sample, then 95% of the generated intervals would contain the true percentage of voters who intend to vote for the given party." That is, $P(\text{a constructed interval contains }\mu)=0.95$. – Sep 08 '16 at 10:01
Continuation: "Each time the polling is repeated, a different confidence interval is produced; hence, it is not possible to make absolute statements about probabilities for any one given interval." That is, for a fixed $I$, it is not correct to write $P(\mu\in I)=0.95$. I don't know more than this. Try asking this question on a statistics site, and tell me what you learn. – Sep 08 '16 at 10:01
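
One possible way to write down the distinction this thread arrives at, as a sketch in notation that does not appear above: $I(\cdot)$ denotes the rule that builds the interval, $X=(X_1,\ldots,X_n)$ the random sample, $x=(x_1,\ldots,x_n)$ the observed data, and $P_\mu$ probability under the model with true mean $\mu$ (all of these symbols are assumptions introduced here). Before the data are observed, the interval is random and $$ P_\mu\big(\mu\in I(X)\big)=0.9\quad\text{for every fixed }\mu, $$ whereas once the data are observed nothing random is left in the frequentist model, so $$ \mathbf{1}\{\mu\in I(x)\}\in\{0,1\}. $$ The first line is a statement about the procedure; the second is a statement about one realized interval.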
