
I don't understand why $Var(X) = E((X-\mu)^2)$.

It's defined as the "expected value of the square of the deviation of $X$ from the mean" but I don't understand why it couldn't be $E(X-\mu)$ as that seems more intuitive for "deviation from the mean".

Is the purpose of the squaring to make deviations positive? Because if so, why not just $E(|X-\mu|)$ instead? What's the point of the squaring?

Jyrki Lahtonen
user439088
  • It can't be $E(X-\mu)$, because $E(X-\mu)=E(X)-E(\mu)=\mu-\mu=0$ – John Doe Apr 21 '17 at 16:46
  • Intuitively, a negative deviation is as important as a positive deviation. You always want to take into account ALL deviations from the mean, regardless of the direction, and you don't want them to "add out" which would hide some of them. – MPW Apr 21 '17 at 16:50
  • See also this post on cross validated – angryavian Apr 21 '17 at 16:50
  • From what I can see, most answers apparently use circular logic, but most of it boils down to 1. Squaring results in nice properties later, and 2. Central Limit Theorem, somehow – user439088 Apr 21 '17 at 16:55
  • According to whuber's comment here Gauss started by squaring stuff and from THAT derived the normal distribution. He mentions a unique role in the CLT but does not elaborate, unfortunately. – user439088 Apr 21 '17 at 16:57

1 Answer


The mean absolute deviation about the mean, $E(\vert X - \mu \vert)$, is an alternative index of variability. A variant is the mean absolute deviation about the median $m$, which is natural because the mean absolute deviation $E(\vert X - a \vert)$ about a value $a$ is minimised when $a=m$.
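As a rough numerical illustration of that minimisation property, here is a small sketch using only the Python standard library. The exponential sample, the seed, the grid of candidate centres and the helper name `mean_abs_dev` are all assumptions chosen for the example, not part of the argument itself.

```python
import random
import statistics

random.seed(0)
# Hypothetical skewed sample (exponential), so that mean and median differ.
# An odd sample size gives a unique sample median.
sample = [random.expovariate(1.0) for _ in range(2001)]


def mean_abs_dev(data, a):
    """Sample analogue of E|X - a|."""
    return sum(abs(x - a) for x in data) / len(data)


mu = statistics.fmean(sample)   # sample mean
m = statistics.median(sample)   # sample median

# Scan candidate centres a = 0.00, 0.01, ..., 3.00 and keep the best one.
grid = [i / 100 for i in range(301)]
best_a = min(grid, key=lambda a: mean_abs_dev(sample, a))

print("mean:", mu, "median:", m, "grid minimiser:", best_a)
print("E|X - median|:", mean_abs_dev(sample, m))
print("E|X - mean|:  ", mean_abs_dev(sample, mu))
```

On a sample like this, the grid minimiser should land next to the sample median rather than the sample mean, and the deviation about the median should be no larger than the deviation about the mean.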

The variance is usually preferred to the mean absolute deviation for a few reasons. One modelling reason is that the quadratic term penalises large deviations more heavily than small ones (presumably, large deviations are worse). Another is that the squared deviation is differentiable and hence easier to handle analytically.
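To see how differentiability helps, here is a short worked computation, assuming $X$ has a finite second moment and writing $\mu = E(X)$. Expanding the square about $\mu$ gives

$$E\big[(X-a)^2\big] = E\big[(X-\mu)^2\big] + 2(\mu-a)\,E(X-\mu) + (\mu-a)^2 = Var(X) + (\mu-a)^2,$$

since $E(X-\mu)=0$. Differentiating with respect to $a$,

$$\frac{d}{da}E\big[(X-a)^2\big] = -2(\mu-a),$$

which vanishes exactly at $a=\mu$. So the mean squared deviation is a smooth function of $a$, it is minimised at the mean, and its minimum value is $Var(X)$. By contrast, $\vert X-a\vert$ has a kink at $a=X$, so the same one-line calculus argument is not available for $E(\vert X-a\vert)$.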

You can learn more about this at https://stats.stackexchange.com/questions/118/why-square-the-difference-instead-of-taking-the-absolute-value-in-standard-devia

mlc
  • Note that the analogue to "$a \mapsto E(|X-a|)$ is minimized when $a$ is the median of $X$" is "$a \mapsto E[(X-a)^2]$ is minimized when $a$ is the mean of $X$." – angryavian Apr 21 '17 at 17:02
  • What role does squaring play in the central limit theorem? – user439088 Apr 21 '17 at 17:03