Theorem (Best Linear Prediction of $Y$): Let $(X,Y)$ have finite moments up to second order, and let $Y'=a+bX$. Then the choices of $a$ and $b$ that minimize $Ed^2(Y,Y')=E(Y-(a+bX))^2$ are given by $$a= E(Y) - \dfrac{cov(X,Y)}{var(X)}\,E(X)$$ and $$b=\dfrac{cov(X,Y)}{var(X)}.$$
Proof: Left to the reader.
I want to prove this theorem. I notice that this $a$ and $b$ look very similar to the case where the correlation equals $1$, except that $cov(X,Y)$ need not equal $std(X)\,std(Y)$, but I can get no further.
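As a sanity check (not a proof), I verified numerically that the stated $a$ and $b$ do beat nearby candidates for the mean squared error. A minimal sketch assuming NumPy, with a made-up example where the true line is $Y = 1 + 2X + \text{noise}$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 1.0 + 2.0 * x + rng.normal(size=100_000)  # true line: Y = 1 + 2X + noise

# the theorem's formulas, computed from sample moments (ddof=0 throughout)
b = np.cov(x, y, ddof=0)[0, 1] / np.var(x)    # b = cov(X,Y)/var(X)
a = y.mean() - b * x.mean()                   # a = E(Y) - b E(X)

def mse(a_, b_):
    """Mean squared prediction error E(Y - (a + bX))^2, estimated on the sample."""
    return np.mean((y - (a_ + b_ * x)) ** 2)

# perturbing either coefficient can only increase the mean squared error
for da, db in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]:
    assert mse(a, b) <= mse(a + da, b + db)
```

With this many samples, `a` and `b` land close to the true intercept $1$ and slope $2$, as expected.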
Also, below the theorem: Now define $V=Y-Y'$ to represent the deviation... Since $EY=EY'$, $EV=0$ (there is no mention of why $EV=0$); $var(Y) = var(Y')+var(V)+2\,cov(Y',V)$ (I get this one)
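Filling in the step the book skips (my own working, so please check): with $a = E(Y) - b\,E(X)$,

$$E(Y') = a + b\,E(X) = \bigl(E(Y) - b\,E(X)\bigr) + b\,E(X) = E(Y),$$

and therefore $E(V) = E(Y) - E(Y') = 0$.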
where: $var(V)=EV^2=Ed^2(Y,Y')=Ed^2(Y,a+bX)=var(Y)-\dfrac{cov(X,Y)^2}{var(X)}$ (Why? Why? Why? I have sat with this for nearly an hour and can't understand this expression. Sometimes the book moves too fast for me to follow.)
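To at least convince myself numerically, I checked the identity $var(V)=var(Y)-\dfrac{cov(X,Y)^2}{var(X)}$ (note the single power of $var(X)$ in the denominator; with $var(X)^2$ it fails). A minimal sketch assuming NumPy; the identity holds exactly for sample moments too, so the two sides agree up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200_000)
y = 0.5 * x + rng.normal(size=200_000)

# best linear predictor from sample moments (ddof=0 throughout)
b = np.cov(x, y, ddof=0)[0, 1] / np.var(x)
a = y.mean() - b * x.mean()
v = y - (a + b * x)                       # V = Y - Y'

lhs = np.var(v)                           # var(V) computed directly
rhs = np.var(y) - np.cov(x, y, ddof=0)[0, 1] ** 2 / np.var(x)  # claimed identity
assert abs(lhs - rhs) < 1e-9              # agree up to floating-point error
```

The algebra behind it is just expanding $var(Y-bX)$ and substituting $b=cov(X,Y)/var(X)$.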