
Consider the following simple linear regression model with error term $\epsilon_i$,

$$y_i = \alpha + \beta x_i + \epsilon_i$$

where

$$\epsilon_i \sim \mathcal N(0,\sigma^2)$$
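For concreteness, this model is easy to simulate; the following Python sketch uses made-up values for $\alpha$, $\beta$, $\sigma$ and the sample size $n$, all of which are illustrative assumptions rather than anything fixed by the model:

```python
import numpy as np

# Simulate y_i = alpha + beta * x_i + eps_i with eps_i ~ N(0, sigma^2).
# alpha, beta, sigma and n are illustrative choices, not part of the model.
rng = np.random.default_rng(0)
n = 30
alpha, beta, sigma = 1.0, 2.0, 0.5
x = rng.uniform(0.0, 10.0, size=n)      # regressor values
eps = rng.normal(0.0, sigma, size=n)    # error terms eps_i
y = alpha + beta * x + eps              # observed responses
```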

We know that if $Z_1, Z_2, \ldots, Z_n$ are independent standard normal random variables,

$$Z_i \sim \mathcal N(0,1)$$

then the sum of their squares is distributed according to the chi-square distribution with $n$ degrees of freedom,

$$\sum _{i=1}^{n} Z_{i}^{2} \sim \chi^{2}_{n}$$
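This fact is easy to check numerically; here is a small Monte Carlo sketch (assuming NumPy and SciPy are available) comparing the simulated sum of squares with the $\chi^{2}_{n}$ distribution:

```python
import numpy as np
from scipy import stats

# Sum of n squared independent standard normals, repeated many times,
# compared against the chi-square distribution with n degrees of freedom.
rng = np.random.default_rng(1)
n, reps = 5, 100_000
z = rng.standard_normal((reps, n))
q = (z ** 2).sum(axis=1)

print(q.mean(), n)                              # E[chi^2_n] = n
print(stats.kstest(q, stats.chi2(df=n).cdf))    # KS test against chi^2_n
```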

Now let's prove that $\frac{(n-2)s^2}{\sigma^2}\sim \chi^{2}_{n-2}$, where

$$s^2 = \frac{\sum _{i=1}^{n} \hat{\epsilon}_i^2}{n-2}$$
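Before attempting a proof, the claim can be sanity-checked by simulation. The sketch below fits the line by ordinary least squares with `np.polyfit` and compares simulated values of $\frac{(n-2)s^2}{\sigma^2}$ with $\chi^{2}_{n-2}$; the parameter values are again illustrative assumptions:

```python
import numpy as np
from scipy import stats

# Monte Carlo check that (n-2) s^2 / sigma^2 behaves like chi^2_{n-2}.
rng = np.random.default_rng(2)
n, reps = 20, 50_000
alpha, beta, sigma = 1.0, 2.0, 0.5      # illustrative parameters
x = rng.uniform(0.0, 10.0, size=n)      # fixed design across replications

stat = np.empty(reps)
for r in range(reps):
    y = alpha + beta * x + rng.normal(0.0, sigma, size=n)
    b, a = np.polyfit(x, y, deg=1)      # least-squares slope and intercept
    resid = y - (a + b * x)             # residuals eps_hat_i
    stat[r] = resid @ resid / sigma**2  # (n-2) s^2 / sigma^2

print(stat.mean(), n - 2)                              # E[chi^2_{n-2}] = n - 2
print(stats.kstest(stat, stats.chi2(df=n - 2).cdf))    # KS test against chi^2_{n-2}
```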

Proof:

$\require{cancel}$

$$\epsilon_i \sim \mathcal N(0,\sigma^2)$$

$$\frac{\epsilon_i - \cancelto{0}{\overline{\epsilon}}}{\sigma} \sim \mathcal N(0, 1)$$

$$\sum _{i=1}^{n} \left(\frac{\epsilon_i}{\sigma}\right)^2 \sim \chi^{2}_{n}$$

$$\sum _{i=1}^{n} \frac{\epsilon_i^2}{\sigma^2} \sim \chi^{2}_{n}$$

How do I now move from the expression $\sum _{i=1}^{n} \frac{\epsilon_i^2}{\sigma^2}$, which uses the error terms $\epsilon_i$, to the expression $\sum _{i=1}^{n} \frac{\hat{\epsilon}_i^2}{\sigma^2}$, which uses the residuals $\hat{\epsilon}_i$?

1 Answer


I will assume the $\varepsilon_i$ are intended to be independent, although you omitted that.

It is unclear, to say the least, why you cancel $\overline \varepsilon$ to $0.$ If you meant $\overline\varepsilon = (\varepsilon_1+\cdots+\varepsilon_n)/n,$ then there's no reason to think it's $0.$ In fact you would have $\overline\varepsilon \sim \operatorname N(0, \sigma^2/n).$
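For completeness, that last variance comes from the independence of the $\varepsilon_i$:

$$\operatorname{Var}(\overline\varepsilon) = \operatorname{Var}\!\left(\frac1n \sum_{i=1}^n \varepsilon_i\right) = \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}(\varepsilon_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$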

With $\displaystyle s^2 = \frac{\sum_{i=1}^n \hat{\varepsilon}_i^2}{n-2},$ the quantity of interest is $(n-2)s^2 = \sum_{i=1}^n \hat{\varepsilon}_i^2,$ the squared length of the residual vector considered below.

You have

$$ y = \left[ \begin{array}{c} y_1 \\ \vdots\phantom{_1} \\ y_n \end{array} \right] = \alpha \left[ \begin{array}{c} 1 \\ \vdots \\ 1 \end{array} \right] + \beta \left[ \begin{array}{c} x_1 \\ \vdots\phantom{_1} \\ x_n \end{array} \right] + \left[ \begin{array}{c} \varepsilon_1 \\ \vdots\phantom{_1} \\ \varepsilon_n \end{array} \right] $$

and

$$ \widehat y = \left[ \begin{array}{c} \widehat y_1 \\ \vdots\phantom{_1} \\ \widehat y_n \end{array} \right] = \widehat\alpha \left[ \begin{array}{c} 1 \\ \vdots \\ 1 \end{array} \right] + \widehat\beta \left[ \begin{array}{c} x_1 \\ \vdots\phantom{_1} \\ x_n \end{array} \right] = \left[ \begin{array}{l} \text{the orthogonal projection of} \\ y \text{ onto the 2-dimensional space} \\ \text{spanned by the two columns} \end{array} \right] $$

and

$$ \widehat\varepsilon = \left[ \begin{array}{c} \widehat \varepsilon_1 \\ \vdots\phantom{_1} \\ \widehat \varepsilon_n \end{array} \right] = \left[ \begin{array}{l} \text{the orthogonal projection of } y \text{ onto the} \\ \text{$(n-2)$-dimensional orthogonal complement} \\ \text{of the span of the two columns} \end{array} \right] $$

Observe that $\widehat\varepsilon$ has expected value $0$ and that the normal distribution of $\varepsilon$ is spherically symmetric, i.e. its density depends on $\varepsilon$ only through the norm $\|\varepsilon\| = \sqrt{\varepsilon_1^2+\cdots + \varepsilon_n^2}$.
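To make the projection explicit, write $X$ for the $n\times 2$ matrix whose columns are the two vectors above (notation introduced here, not in the question) and $H = X(X^\top X)^{-1}X^\top$ for the orthogonal projection onto their span. Then

$$\widehat\varepsilon = y - \widehat y = (I - H)y = (I - H)\left(X\begin{bmatrix}\alpha\\\beta\end{bmatrix} + \varepsilon\right) = (I - H)\varepsilon,$$

since $(I-H)X = 0.$ So $\widehat\varepsilon$ is also the orthogonal projection of the error vector $\varepsilon$ itself onto that $(n-2)$-dimensional complement.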

Therefore, in the aforementioned $(n-2)$-dimensional space, the components of the vector $\widehat\varepsilon$ along mutually orthogonal directions are independent, each normally distributed with expected value $0$ and variance $\sigma^2.$ Their sum of squares is $\|\widehat\varepsilon\|^2 = \sum_{i=1}^n \widehat\varepsilon_i^2 = (n-2)s^2,$ so dividing each of those $n-2$ components by $\sigma$ gives independent standard normals and

$$\frac{(n-2)s^2}{\sigma^2} = \frac{\sum_{i=1}^{n} \widehat\varepsilon_i^2}{\sigma^2} \sim \chi^{2}_{n-2},$$

which is exactly the move from the errors $\varepsilon_i$ to the residuals $\widehat\varepsilon_i$ that the question asks about.
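If it helps, this geometric statement can also be checked numerically. The sketch below (assuming NumPy; the values of $n$, $\sigma$ and the $x_i$ are made up) builds the projection onto the orthogonal complement, expresses simulated error vectors in an orthonormal basis of that complement, and verifies that the resulting $n-2$ coordinates each have standard deviation about $\sigma$ and that their squares sum to $\|\widehat\varepsilon\|^2$:

```python
import numpy as np

# Check the geometry: eps_hat is the projection of eps onto the (n-2)-dimensional
# complement, and its coordinates in an orthonormal basis of that space behave
# like n-2 independent N(0, sigma^2) variables.
rng = np.random.default_rng(3)
n, sigma = 10, 0.5                          # illustrative values
x = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])        # the two columns: all ones, and x

H = X @ np.linalg.solve(X.T @ X, X.T)       # projection onto the span of the columns
M = np.eye(n) - H                           # projection onto the orthogonal complement

# Orthonormal basis of the complement: eigenvectors of M with eigenvalue 1.
vals, vecs = np.linalg.eigh(M)
Q = vecs[:, vals > 0.5]                     # shape (n, n-2)

reps = 50_000
eps = rng.normal(0.0, sigma, size=(reps, n))
coords = eps @ Q                            # coordinates of eps_hat in that basis

print(coords.std(axis=0))                   # each of the n-2 components: roughly sigma
print(np.allclose((coords ** 2).sum(axis=1),        # sum of squared coordinates
                  ((eps @ M) ** 2).sum(axis=1)))    # equals ||eps_hat||^2
```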