
I'm going to resist talking in terms of sample/population. I have a set of n observations and I'm trying to understand why its standard deviation around the mean uses an n divisor rather than n-1.

Say my set is {1,2,3}. The mean of this set is 2. It seems to me that there are only 2 degrees of freedom for (x - mean), because the third deviation must be -1 times the sum of the other two. There are only 2 independent deviations.

Edit: I understand Bessel's correction and why the SD of a sample divides by n-1. I'm asking the opposite: why does the SD of a population divide by n, even though we can "derive" the last deviation from the others?
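(For concreteness, the two candidate divisors on this example can be checked with a short Python sketch using the standard-library `statistics` module, whose `pstdev`/`stdev` functions implement the n and n-1 versions respectively:)

```python
import math
import statistics

data = [1, 2, 3]
mean = sum(data) / len(data)                  # 2.0
sq_dev = sum((x - mean) ** 2 for x in data)   # (-1)^2 + 0^2 + 1^2 = 2

pop_sd = math.sqrt(sq_dev / len(data))         # divide by n = 3
samp_sd = math.sqrt(sq_dev / (len(data) - 1))  # divide by n - 1 = 2

print(pop_sd, statistics.pstdev(data))   # both sqrt(2/3) ≈ 0.8165
print(samp_sd, statistics.stdev(data))   # both 1.0
```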

Teddy K
  • "I'm going to resist talking in terms of sample/population." Why? They are good words to use when talking about this exact issue. – Arthur Jan 12 '20 at 00:11
  • I agree that they may be useful when talking about standard deviation for a sample/population, but I'm not sampling here. I'm only interested in the deviation of a set of observations that are known to me. – Teddy K Jan 12 '20 at 00:13
  • "A set of observations that are known to me" sounds like a sample to me. – Arthur Jan 12 '20 at 00:15
  • But just to be clear, are you asking why we would use $$\hat\sigma=\sqrt{\frac{(-1)^2+0^2+1^2}{2}}$$ for your example observations instead of $$\hat\sigma=\sqrt{\frac{(-1)^2+0^2+1^2}{3}},$$ or are you asking about something else? – Arthur Jan 12 '20 at 00:19
  • @Arthur no I'm asking the opposite. My "set of observations" isn't a sample. I haven't sampled anything. I have written down a set of numbers on a piece of paper, period. I'm asking why the SD is the second formula in your comment (dividing by 3), and not the first (dividing by 2). – Teddy K Jan 12 '20 at 00:20
  • There are two reasons why we might be talking about averages and standard deviations and such... One reason might be to describe the data that we see and only the data that we see. In such a situation it is division by $n$ in the formula, not $n-1$. The other reason might be to make predictions about the random process which generated that data under the assumption that we do not fully understand why it was those numbers we saw and so we allow ourselves to think the standard deviation might be a bit bigger than otherwise thought. In such a situation we divide by $n-1$ instead... – JMoravitz Jan 12 '20 at 00:23
  • I am not interested in making predictions about the process that generated the data. I am trying to describe the data that we see and only the data that we see. – Teddy K Jan 12 '20 at 00:24
  • Then divide by $n$ and not by $n-1$. An example of when we would divide by $n$ and not $n-1$ is for example when talking about the standard deviation of the result of a fair die. An example of when we would divide by $n-1$ instead might be when polling $0.1\%$ of the population on how they would vote in the next election to make a prediction as to how the result of the election will turn out. – JMoravitz Jan 12 '20 at 00:26
  • 1
    Related posts: https://stats.stackexchange.com/questions/3931/intuitive-explanation-for-dividing-by-n-1-when-calculating-standard-deviation and https://math.stackexchange.com/questions/15098/sample-standard-deviation-vs-population-standard-deviation – JMoravitz Jan 12 '20 at 00:30
  • @JMoravitz I'm trying to understand why we divide by 3. Yes, we're taking an average of 3 dispersions around the mean. But I'm confused why we're taking the average, because one of those dispersions can be "derived" by the other two. There are really only two independent dispersions around the mean. That's the part that's confusing me. – Teddy K Jan 12 '20 at 00:31
  • Wait... your complaint is that one of the terms was zero? Is that all? So... if the multiset of observations was $\{1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,\dots,2,2,2,3\}$ you'd still think that we should only be dividing by $2$ instead of by $1000$ or however many numbers there are there? Surely if you have a random process that $998$ times out of $1000$ produces a $2$ and only rarely produces a $1$ or $3$, you'd expect very little change from one result to the next... Of course the zeroes matter and should not be ignored....... why would you ignore them? – JMoravitz Jan 12 '20 at 00:36
  • No, my complaint isn't that one of the terms was zero. In the example you used, I'm asking why we wouldn't divide by 999 (one less than N). The reason I ask that is because it seems to me that in the set of all deviations from the mean (a set of 1000 deviations), only 999 are really independent. – Teddy K Jan 12 '20 at 00:40
  • @TeddyK You are complicating things by thinking about 'degrees of freedom' without additional context. Standard deviation of the set of observations $(x_1,x_2,\ldots,x_n)$ is defined as the root mean squared deviation about the mean, i.e. $\sqrt{\frac1n\sum_{i=1}^n(x_i-\overline x)^2}$. – StubbornAtom Jan 12 '20 at 12:40
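(As an illustration of the comments above, here is a minimal Python sketch of JMoravitz's fair-die example, computed with StubbornAtom's root-mean-squared-deviation definition; the divisor is $n=6$ because the six faces are the entire set being described, not a sample from it:)

```python
import math

faces = [1, 2, 3, 4, 5, 6]          # the complete set of outcomes of a fair die
mean = sum(faces) / len(faces)      # 3.5
# Root mean squared deviation about the mean, dividing by n:
sd = math.sqrt(sum((x - mean) ** 2 for x in faces) / len(faces))
print(sd)  # sqrt(35/12) ≈ 1.7078
```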

1 Answer


Without more context, there’s no reason to think about degrees of freedom. The standard deviation is simply the square root of the average squared deviation from the mean. Mathematically, there’s nothing more to say.

Kevin Carlson
  • It may "simply" be that to you, but I think it's fine to ask about degrees of freedom here too. If "degrees of freedom" has a mathematical definition, I believe there could be quite a lot to say, mathematically. – Teddy K Jan 12 '20 at 00:16
  • 1
    @TeddyK What is unclear is why “degrees of freedom” is a relevant notion to think about here. It comes up in statistics because we are estimating an unknown quantity from some observations, and then the fact that the observations are not all independent affects what the best estimate of that quantity is. But that quantity itself is just an average deviation. – Kevin Carlson Jan 12 '20 at 05:26