3

Hi guys I was reading a paper and it was left as easy to show that (for $y_k,s_k \in \mathbb R^n$ and $B_k \in \mathbb R^{n \times n}$)

$$B_{k+1} = B_k + \frac{ y_k (y_k-B_k s_k)^T+ (y_k-B_k s_k) y_k^T }{y_k^Ts_k } -\frac{\langle y_k- B_k s_k, s_k \rangle}{(y_k^Ts_k)^2 } y_k y_k^T $$

implies

$$B_{k+1} = B_k -\frac{B_ks_ks_k^TB_k}{s_k^TB_ks_k}+ \gamma y_k y_k^T$$

For $\gamma = \frac{1}{y_k^T s_k}$. Where this has the assumptions that $s_k = x_{k+1}-x_k$ and $y_k = \nabla f(x_{k+1})- \nabla f(x_{k})$. $B_k$ is symmetric positive definite matrix. Also I think we can assume $B_{k+1}s_k = y_k$

I want to show this is true: My attempt

$$B_{k+1} = B_k + \frac{ y_k y_k^T-y_ks_k^TB_k+ y_k y_k^T-B_k s_k y_k^T }{y_k^Ts_k } -\frac{\langle y_k, s_k \rangle}{(y_k^Ts_k)^2 } y_k y_k^T +\frac{\langle B_k s_k, s_k \rangle}{(y_k^Ts_k)^2 } y_k y_k^T$$

Where I just distributed and used the prop of inner product. Now

$$B_{k+1} = B_k + \frac{ y_k y_k^T-y_ks_k^TB_k+ y_k y_k^T-B_k s_k y_k^T }{y_k^Ts_k } -\frac{ y_k^Ts_k }{(y_k^Ts_k)^2 } y_k y_k^T +\frac{ s_k^T B_k s_k }{(y_k^Ts_k)^2 } y_k y_k^T$$

Using the inner product. $$B_{k+1} = B_k + \frac{ y_k y_k^T-y_ks_k^TB_k+ y_k y_k^T-B_k s_k y_k^T - y_k y_k^T}{y_k^Ts_k } +\frac{ s_k^T B_k s_k }{(y_k^Ts_k)^2 } y_k y_k^T$$

Now I am left with $$B_{k+1} = B_k + \frac{ y_k y_k^T-y_ks_k^TB_k+ -B_k s_k y_k^T }{y_k^Ts_k } +\frac{ s_k^T B_k s_k }{(y_k^Ts_k)^2 } y_k y_k^T$$

Now I am unsure how to cancel the last term because it has a square term I need to get rid off.

Kori
  • 1,530

1 Answers1

1

The first update formula is not BFGS. It is DFP update. They are different algorithms, it is not possible to rewrite one to another. However, there is a certain symmetry between them: if you look at the DFP update for the inverse Hessian $H_k=B_k^{-1}$ in the link above, you may notice that the BFGS update for $B_k$ looks very similar - you just need to replace all $B$ with $H$, $y$ with $s$ and $s$ with $y$. In this sense, it is easy to get the BFGS update, but from the inverse Hessian DFP update.

A.Γ.
  • 30,381
  • Correct this is the DFP, and I was reading the proof of it, I wanted to understand how to derive it (Not assuming it is true for the inverse and substantiate) – Kori May 10 '18 at 02:14
  • 2
    @Kori Derivation of DFP update for $B$ is here. Derivation of BFGS update for $H$ is here. Update for inverses is done via Woodbury matrix identity. It is an exercise in Nocedal, Wright book Numerical Optimization. – A.Γ. May 10 '18 at 07:22