My teacher wanted us to attempt to prove this. I noticed that the summation on the left represents SST (the total sum of squares), and that the second summation on the right measures the variability of the $y$'s around the fitted regression. But what is the first summation? And how can I manipulate the right side to get the left side?
-
What is $\bar{Y}_i$? – David May 05 '16 at 14:54
-
I'm honestly very confused, because that first summation looks very similar to $\hat{Y}_i$, but my teacher wrote a bar instead. Do you think it was an error? – Lil May 05 '16 at 14:56
-
Yeah, usually it is written $\sum_{i} (Y_i - \bar{Y})^2 = \sum_i (Y_i - \hat{Y}_i)^2 + \sum_i (\hat{Y}_i - \bar{Y})^2$, i.e. SST = SSE + SSR. Check out https://en.wikipedia.org/wiki/Coefficient_of_determination – David May 05 '16 at 15:01
-
Yeah, that's what I have noticed online... this is used to represent ANOVA, correct? – Lil May 05 '16 at 15:01
-
http://www.robots.ox.ac.uk/~fwood/teaching/W4315_Fall2010/Lectures/lecture_6/lecture_6.pdf So I'm following the solution to the proof here. Do you know how they got the line after they factored the summation? – Lil May 05 '16 at 15:20
-
@Lil Does the answer help or not? – callculus42 May 05 '16 at 19:14
2 Answers
$\sum_{i=1}^n(y_i-\overline y)^2=\sum_{i=1}^n((\hat y_i-\overline y)+(y_i - \hat y_i))^2$
$=\sum_{i=1}^n((\hat y_i-\overline y)^2+2(\hat y_i-\overline y)(y_i-\hat y_i)+(y_i-\hat y_i)^2)$
$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2\sum_{i=1}^n(y_i-\hat y_i)(\hat y_i-\overline y)$
$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2\sum_{i=1}^n(y_i-\hat y_i)(\hat\beta_0+\hat\beta_1x_{i1}+\hat\beta_2x_{i2}+\dots+\hat\beta_mx_{im}-\overline y)$
Now let $\hat u_i=y_i-\hat y_i$, so the expression equals
$\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2\sum_{i=1}^n\hat u_i(\hat\beta_0+\hat\beta_1x_{i1}+\hat\beta_2x_{i2}+...+\hat\beta_mx_{im}-\overline y)$
$=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2+2(\hat\beta_0-\overline y)\cdot \sum_{i=1}^n \hat u_i+2\hat\beta_1 \sum_{i=1}^n \hat u_ix_{i1}+2\hat\beta_2 \sum_{i=1}^n \hat u_ix_{i2}+\dots+2\hat\beta_m \sum_{i=1}^n \hat u_ix_{im}$
We have $\sum_{i=1}^n \hat u_i=0$ and $\sum_{i=1}^n \hat u_ix_{ij}=0$ for all $j=1,2,\dots,m$.
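(A short sketch of why these two facts hold, assuming $\hat\beta_0,\hat\beta_1,\dots,\hat\beta_m$ are the least-squares estimates: they minimize $S(\beta)=\sum_{i=1}^n(y_i-\beta_0-\beta_1x_{i1}-\dots-\beta_mx_{im})^2$, so the first-order conditions $\frac{\partial S}{\partial\beta_0}=0$ and $\frac{\partial S}{\partial\beta_j}=0$ evaluated at the estimates give $-2\sum_{i=1}^n\hat u_i=0$ and $-2\sum_{i=1}^n\hat u_ix_{ij}=0$ respectively.)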
Finally, it becomes
$\sum_{i=1}^n(y_i-\overline y)^2=\sum_{i=1}^n(\hat y_i-\overline y)^2+\sum_{i=1}^n(y_i-\hat y_i)^2$
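As a quick sanity check (not part of the proof), here is a minimal numerical sketch of the decomposition, assuming NumPy is available; the data are made up purely for illustration:

```python
import numpy as np

# Made-up data for a regression with two predictors (illustration only).
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = 3 + 2 * X[:, 1] - X[:, 2] + rng.normal(size=n)

# OLS fit: beta_hat minimizes the sum of squared residuals ||y - X beta||^2.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat          # residuals
y_bar = y.mean()

sst = np.sum((y - y_bar) ** 2)
sse = np.sum(u_hat ** 2)
ssr = np.sum((y_hat - y_bar) ** 2)

# The cross term vanishes because the residuals are orthogonal to every column of X
# (including the intercept column), so SST = SSE + SSR up to floating-point error.
print(np.isclose(np.sum(u_hat), 0.0))   # sum of residuals is 0
print(np.allclose(X.T @ u_hat, 0.0))    # residuals orthogonal to each regressor
print(np.isclose(sst, sse + ssr))       # the decomposition holds
```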
-
In the proof, before "Finally": $\sum_{i=1}^n \hat u_i=0$ has a proof here: http://math.stackexchange.com/questions/494181/why-the-sum-of-residuals-equals-0-when-we-do-a-sample-regression-by-ols. I could not see why the latter holds: $\sum_{i=1}^n \hat u_i x_{ij}=0$ for all $j=1,\dots,m$. – Erdogan CEVHER Nov 21 '16 at 08:26
-
Now I got it. In the same link, the partial derivative with respect to $\beta_j$ gives the second fact above, i.e., $\sum_{i=1}^n \hat u_i x_{ij}=0$ for all $j=1,\dots,m$. It would be better if the answer included the proofs of these two facts as well. – Erdogan CEVHER Nov 21 '16 at 08:48
-
@ErdoganCEVHER It seems that it wasn't required, otherwise the OP would have asked for it. With your hints an interested reader can have a look at it. Anyway, your comments are helpful. – callculus42 Nov 21 '16 at 09:13
Here's an elementary proof that SST = SSR + SSE in the case of simple linear regression.
First recall the estimators for slope and intercept are $$(1)\quad\hat\beta_1:=\frac{\sum(y_i-\bar y)(x_i-\bar x)}{\sum(x_i-\bar x)^2}\qquad(2)\quad\hat\beta_0:=\bar y - \hat\beta_1\bar x$$ respectively, and the $i$th predicted response is $$\hat y_i:=\hat\beta_0+\hat\beta_1x_i\stackrel{(2)}=\bar y + \hat\beta_1(x_i-\bar x).\tag3$$ Now compute $$\begin{aligned} SST&:=\sum (y_i-\bar y)^2=\sum\left[(y_i-\hat y_i) + (\hat y_i-\bar y)\right]^2\\ &=\underbrace{\sum(y_i-\hat y_i)^2}_{SSE} +2\sum(y_i-\hat y_i)(\hat y_i-\bar y)+\underbrace{\sum(\hat y_i-\bar y)^2}_{SSR}. \end{aligned}$$ The middle term must equal zero. Indeed, substitute (3) twice and do some algebra: $$\begin{aligned} \sum(y_i-\hat y_i)(\hat y_i-\bar y)& \stackrel{(3)}=\sum\left(y_i-[\bar y+\hat\beta_1(x_i-\bar x)]\right)\cdot\hat\beta_1(x_i-\bar x)\\ &=\sum\left([y_i-\bar y]-\hat\beta_1(x_i-\bar x)\right)\cdot\hat\beta_1(x_i-\bar x)\\ &=\hat\beta_1\sum(y_i-\bar y)(x_i-\bar x) -\hat\beta_1^2\sum(x_i-\bar x)^2 \end{aligned} $$ But this last quantity is zero because $\sum(y_i-\bar y)(x_i-\bar x)=\hat\beta_1\sum(x_i-\bar x)^2$ by (1).
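For what it's worth, a small numerical sketch of (1)–(3) and the vanishing cross term (assuming NumPy; the data points are made up for illustration):

```python
import numpy as np

# Made-up (x, y) data for a simple linear regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.8])

x_bar, y_bar = x.mean(), y.mean()

# Slope (1), intercept (2), and fitted values (3).
beta1 = np.sum((y - y_bar) * (x - x_bar)) / np.sum((x - x_bar) ** 2)
beta0 = y_bar - beta1 * x_bar
y_hat = beta0 + beta1 * x

# The middle (cross) term is zero, so SST splits into SSE + SSR.
cross = np.sum((y - y_hat) * (y_hat - y_bar))
sst = np.sum((y - y_bar) ** 2)
sse = np.sum((y - y_hat) ** 2)
ssr = np.sum((y_hat - y_bar) ** 2)

print(np.isclose(cross, 0.0))      # True: middle term vanishes
print(np.isclose(sst, sse + ssr))  # True: SST = SSE + SSR
```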
-
Elegant proof; I was sure it was possible to do it simply, without using orthogonality properties. – Adrien Portier Sep 26 '24 at 08:57
-
It's also interesting to combine this with the derivation of the slope estimator via least squares: https://www.amherst.edu/system/files/media/1287/SLR_Leastsquares.pdf – howard Apr 06 '25 at 21:29
