
It's not an exercise for uni or anything like that, just something that's been bothering me a bit, and I can't seem to find useful information on the web about it.

When talking about real-valued scalar functions, we know that Newton's method will surely converge to a root $s$ of $f(x)$ if our initial value $x_0$ is sufficiently close to the root, meaning that on the interval $(s-r,s+r)$, where $r=|s-x_0|$, the derivative $f'(x)$ is never zero (assuming $f'(s)$ is not zero).

My question is: can we extend that criterion to higher dimensions?

Let $f: \mathbb{R}^n \to \mathbb{R}^n$ be a differentiable function with continuous partial derivatives. Let $s \in \mathbb{R}^n$ be such that $f(s)=0$, let $x_0 \in \mathbb{R}^n$, and let $r=\|s-x_0\|$.

Assume the Jacobian of $f$ is invertible everywhere in the closed ball of radius $r$ centered at $s$. Prove or disprove that Newton's method will converge to $s$ if our initial value is $x_0$.

Reminder:

Newton's method is the iteration $x_{n+1}=x_n-J^{-1}(x_n)\,f(x_n)$, where $J$ denotes the Jacobian of $f$.
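
For concreteness, here is a minimal numerical sketch of that iteration in Python (the system `F`, its Jacobian `DF`, and the helper `newton` below are just illustrative choices, not part of the question itself):

```python
import numpy as np

def newton(F, DF, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{n+1} = x_n - J^{-1}(x_n) f(x_n).

    Solves the linear system J(x_n) d = f(x_n) at each step
    rather than forming the inverse explicitly.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = np.linalg.solve(DF(x), F(x))  # d = J^{-1}(x_n) f(x_n)
        x = x - d
        if np.linalg.norm(d) < tol:
            break
    return x

# Illustrative system with a root at s = (1, 1):
F  = lambda x: np.array([x[0]**2 - 1.0, x[0] * x[1] - 1.0])
DF = lambda x: np.array([[2 * x[0], 0.0], [x[1], x[0]]])

print(newton(F, DF, x0=[2.0, 3.0]))  # -> approximately [1. 1.]
```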

Oria Gruber
  • @Arthur What do you mean by $\nabla f$? It is a function $f: \mathbb{R}^n \to \mathbb{R}^n$. – Pedro M. Feb 20 '15 at 16:11
  • start with http://math.stackexchange.com/questions/457903/newtons-method-in-higher-dimensions-explained?rq=1 – Will Jagy Feb 20 '15 at 16:26
  • I do not think your conditions for the convergence of the scalar case are sufficient. That is, it's not enough to say that $f'(x)\neq 0$ on $(s-r,s+r)$. There needs to be some sort of regularity condition on $f'$, for instance $|f(x)f''(x)|<(f'(x))^2$ for all $x\in(s-r,s+r)$. That's not necessarily going to be true for any starting point $x_0$ and $r$. A similar regularity condition will be required in the multidimensional case as well; I think D. Thomaine's pair of conditions on $J$ and $J^{-1}$ serve that purpose. – Michael Grant Feb 20 '15 at 17:03
  • @Michael Grant: the "differentiable everywhere with continuous partial derivatives" implies that $f$ is $\mathcal{C}^1$, which is needed for this proof (thanks to this, I know that these conditions on $J$ and $J^{-1}$ are satisfied on a neighborhood of $x_0$). I think that the assumption that $f$ be $\mathcal{C}^1$ should also be satisfied in dimension $1$, in which case you are right: being differentiable is not enough. There is no need to summon the second derivative, though. – D. Thomine Feb 20 '15 at 20:32

1 Answer


This generalization of Newton's method works as well, at least for starting points close enough to $s$.

Let $\varepsilon > 0$ be a constant, to be chosen later. Since $f$ is $\mathcal{C}^1$ and $J(s)$ is invertible, we can pick $\delta > 0$ such that $\|J(x)-J(s)\| \leq \varepsilon$ and $\|J^{-1}(x)-J^{-1}(s)\| \leq \varepsilon$ on the $\delta$-neighborhood of $s$.

For $x \in \overline{B}(s, \delta)$, let $T(x) := x-J^{-1}(x)f(x)$. This is well-defined, since $J(x)$ is invertible on $\overline{B}(s, \delta)$ by the choice of $\delta$. Then:

$$f(x) = \int_0^1 J(s+t(x-s)) \cdot (x-s) \ dt,$$

using $f(s)=0$ and the fundamental theorem of calculus along the segment from $s$ to $x$.

Let me write $J_t(x) := J(s+t(x-s))$. Then:

$$ \begin{align} J^{-1}(x)f(x) & = \int_0^1 J^{-1}(x) \cdot J_t(x) \cdot (x-s) \ dt \\ & = x-s + \int_0^1 [J^{-1}(x) \cdot J_t(x) - I]\cdot (x-s) \ dt, \end{align} $$

$$ \begin{align} \|T(x)-s\| & = \left\| \int_0^1 [J^{-1}(x) \cdot J_t(x) - I]\cdot (x-s) \ dt \right\| \\ & \leq \left( \max_{t \in [0,1]} \left\| J^{-1}(x) \cdot J_t(x) - I \right\| \right) \cdot \|x-s\|. \end{align} $$

But for all $x$ and $y$ in $\overline{B}(s, \delta)$, the triangle inequality gives $\|J(x)-J(y)\| \leq \|J(x)-J(s)\| + \|J(s)-J(y)\| \leq 2 \varepsilon$. Hence,

$$\max_{t \in [0,1]} \left\| J^{-1}(x) \cdot J_t(x) - I \right\| \leq \|J^{-1}(x)\| \cdot \left( \max_{t \in [0,1]} \left\| J_t(x) - J(x) \right\| \right) \leq 2 \varepsilon \left(\|J^{-1}(s)\|+\varepsilon \right).$$

Now, choose $\varepsilon$ small enough that the right-hand side is at most, say, $1/2$. Note that this implies that $T$ maps $\overline{B}(s, \delta)$ into itself, so we can iterate $T$ as many times as we want. Then, for all $x$ in $\overline{B}(s, \delta)$ and all $n \geq 0$:

$$\|T^n (x)-s \| \leq \delta 2^{-n},$$

so Newton's method converges at least exponentially fast (actually, even faster).
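
As a numerical sanity check (not part of the proof), iterating the map $T$ on a hypothetical system whose root is $s = (1,1)$ shows the error $\|T^n(x)-s\|$ shrinking much faster than $\delta\, 2^{-n}$; it roughly squares at each step:

```python
import numpy as np

# Illustrative system with root s = (1, 1); DF is its Jacobian.
F  = lambda x: np.array([x[0]**2 - 1.0, x[0] * x[1] - 1.0])
DF = lambda x: np.array([[2 * x[0], 0.0], [x[1], x[0]]])
s  = np.array([1.0, 1.0])

x = np.array([1.4, 0.7])  # a point in a small ball around s
for n in range(6):
    print(n, np.linalg.norm(x - s))        # ||T^n(x) - s||
    x = x - np.linalg.solve(DF(x), F(x))   # x <- T(x)
```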

D. Thomine