
I'm looking at the classical implicit function theorem in Banach spaces. So $X,Y,Z$ are Banach spaces, $F: U_{x_0}\times V_{y_0} \to Z$ is continuous and continuously differentiable with respect to $y$, the Fréchet partial derivative $F'_y(x_0,y_0)$ has a bounded inverse, and $F(x_0,y_0)=O.$ Then locally there is a unique implicit function $T$ with $F(x,Tx)=O,$ and this $T$ is continuous.

However, if we assume an additional smoothness condition on $F,$ then $T$ inherits it. Of course, formal differentiation of $F(x,Tx) = O$ tells us that if $T$ is Fréchet differentiable, then $T'x = - F'_y(x,Tx)^{-1} F'_{x}(x,Tx),$ possibly on a smaller ball. So if I assume additionally that $F$ is continuously differentiable, I am trying to prove that $T$ is indeed differentiable by showing that $||\omega(x,h)||:=||T(x+h)-Tx+F'_y(x,Tx)^{-1} F'_{x}(x,Tx)h|| = o(||h||)$ as $h \to O,$ i.e. trying to arrive at a bound of the form $||\omega(x,h)||\leq c \varepsilon ||h||.$
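To spell out the formal computation behind this candidate derivative: applying the chain rule to $x\mapsto F(x,Tx)\equiv O$ (assuming $T$ is differentiable) gives, for every $h\in X,$ $$ O = F'_x(x,Tx)h + F'_y(x,Tx)\,T'(x)h, $$ and applying $F'_y(x,Tx)^{-1}$ and solving for $T'(x)h$ gives exactly the formula above.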

But the bound I'm currently getting is something along these lines (with $\overline{x}:=x+h,\ y:=Tx,\ \overline{y}:=T(x+h)$):

\begin{align*} ||\omega(x,h)||&= ||F'_y (x,y)^{-1}\big[-F'_y(x,y)(\overline{y}-y)-F'_x(x,y)h\big]|| \\ &\leq c_1 ||F(x,y)-F(x,\overline{y})+\nu(x,\overline{y}-y)+F(x,y)-F(\overline{x},y)+\mu(h,x)|| \\ &\leq c_1 \big( \varepsilon ||\overline{y}-y|| + \varepsilon ||h|| + ||F(x,\overline{y})-F(x,y)+F(\overline{x},y)-F(x,y)||\big), \end{align*} provided that $||h|| < \delta(\varepsilon)$ and $||\overline{y}-y|| < \delta(\varepsilon),$ where $c_1:=||F'_y (x,y)^{-1}||_{L(Z,Y)}.$
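Here $\nu$ and $\mu$ denote the first-order remainders of $F$ at $(x,y)$, i.e.
\begin{align*} F(x,\overline{y})-F(x,y) &= F'_y(x,y)(\overline{y}-y)+\nu(x,\overline{y}-y), \qquad ||\nu(x,\overline{y}-y)||\leq\varepsilon ||\overline{y}-y||,\\ F(\overline{x},y)-F(x,y) &= F'_x(x,y)h+\mu(h,x), \qquad ||\mu(h,x)||\leq\varepsilon ||h||, \end{align*}
the $\varepsilon$-bounds being valid once $||h||,\ ||\overline{y}-y|| < \delta(\varepsilon).$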

Now I estimate the last term inside the norm in the same way, via the partial Fréchet derivatives: \begin{align*} ||F(x,\overline{y})-F(x,y)+F(\overline{x},y)-F(x,y)|| \leq ||F'_y(x,y)||\cdot ||\overline{y} - y|| +\varepsilon ||\overline{y}-y|| +||F'_x(x,y)||\cdot||h||+\varepsilon ||h||. \end{align*} Denoting $c_2:=||F'_y(x,y)||_{L(Y,Z)}$ and $c_3:=||F'_x(x,y)||_{L(X,Z)},$ this reads $||F(x,\overline{y})-F(x,y)+F(\overline{x},y)-F(x,y)|| \leq (c_2 + \varepsilon)||\overline{y}-y|| + (c_3 +\varepsilon)||h||,$ and substituting into the inequality for $||\omega(x,h)||$ we get $$ || \omega(x,h)|| \leq c_1 \big[(2\varepsilon+c_2)||\overline{y}-y||+(2\varepsilon+c_3)||h||\big]. $$ A bound for $||\overline{y}-y||$ follows from this inequality combined with the reverse triangle inequality applied to the definition of $\omega$: $$ ||\overline{y}-y|| - ||F'_y(x,y)^{-1}F'_x(x,y)||\cdot||h|| \leq ||\omega(x,h)||\leq c_1 \big[(2\varepsilon+c_2)||\overline{y}-y||+(2\varepsilon+c_3)||h||\big]. $$ Therefore, writing $c_4:=||F'_y(x,y)^{-1}F'_x(x,y)||\leq c_1 c_3,$ we get $(1-c_1(2\varepsilon+c_2))||\overline{y}-y|| \leq (c_1 c_3 +c_1 (2\varepsilon +c_3))||h||.$ This yields the following absurd (useless) bound: $$ ||\omega(x,h)|| \leq c_1\left[ (2\varepsilon+c_3) + \frac{2\varepsilon+c_2}{1-c_1(2\varepsilon+c_2)} \big[c_1 c_3 + c_1(2\varepsilon + c_3)\big] \right] ||h||. $$ The problem with this bound is that its coefficient would have to be a constant multiple of $\varepsilon$ in order to conclude $o(||h||),$ and it is not. Any suggestions on how to get a better bound will be much appreciated.

For reference: I tried to avoid the argument in Deimling's book, where the implicit function theorem is proven (without the assertion about differentiability of the implicit function), then the inverse function theorem is proven, and differentiability of the inverse function is established quite easily.

After that, Deimling proves differentiability of the implicit function using the inverse function theorem. There he assumes $F$ is $C^m$ and considers the map $G(x,y):=(x, F'_y(x_0,y_0)^{-1}F(x,y)),$ and claims that since $$ G'(x_0,y_0)(h,k) = (h, k+ F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0)h), $$

$G'(x_0,y_0)$ must be a homeomorphism (which I don't see why it is automatically true).

And then $G^{-1}(x,O)=(x,Tx),$ with $T$ exactly the implicit function we are discussing. Now it's clear how differentiability (and the $C^m$ property) in the inverse function theorem establishes it in the implicit function theorem.

So an additional question: is it obvious that $G'(x_0,y_0) = (Id_X, Id_Y +F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0))$ is a homeomorphism?

Update: It is obvious. Since the second coordinate is a translation by a bounded linear operator applied to the first coordinate, the inverse is obtained by flipping the sign: $$G'(x_0,y_0)^{-1}=(Id_X, Id_Y - F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0)).$$ Now it's clear how the rest follows.
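Indeed, writing $A:=F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0)\in L(X,Y),$ the maps $(h,k)\mapsto(h,k+Ah)$ and $(h,k)\mapsto(h,k-Ah)$ are bounded and compose (in either order) to the identity: $$ (h,k)\mapsto(h,k-Ah)\mapsto(h,(k-Ah)+Ah)=(h,k), $$ so $G'(x_0,y_0)$ is a linear homeomorphism.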

Petar
  • The proof of the implicit (or inverse) function theorem gives us that the implicit (resp. inverse) function is in fact differentiable, and thus, by the formula for the derivative, you can deduce that if $F$ is $C^k$, then so is $T$. I suggest looking over the proof of the theorem, where differentiability is established. From what I remember (in the case of the inverse function theorem), the local inverse is first shown to be Lipschitz continuous, and then, by a relatively straightforward estimate (using facts previously established in the proof), differentiable. – peek-a-boo Jun 04 '23 at 15:32
  • I'm trying to prove differentiability of the implicit function; I'm trying to prove the theorem, not use it :). For reference, in Deimling's Nonlinear Functional Analysis it's proven indirectly via the inverse function theorem. But that route looks quite unnatural to me, hence my attempts to get it from elementary (though long) bounds. – Petar Jun 04 '23 at 15:38
  • I understand you're trying to prove it, but I'm saying if you're stuck, then look up a standard reference, because differentiability is established directly (but slightly painfully) in the proof itself (while extra regularity like $C^k$ can easily be deduced by differentiating, looking at the explicit formula for the derivatives, and using induction). For example, Loomis-Sternberg, chapter 4.9, theorems 9.3 and 9.4, prove the implicit function theorem. Abraham-Marsden-Ratiu prove the inverse one, and prove the equivalence of the two IFTs. – peek-a-boo Jun 04 '23 at 15:41
  • Lang must also have a proof, though I forget which way he proceeds. See also Dieudonné, Henri Cartan, etc. But I should remark that proving the equivalence of the two IFTs is something you should get comfortable with (not just the statement, but the proof of the equivalence). The proof of equivalence is, I'd say, a very natural idea from linear algebra for converting back and forth between rectangular and square systems of equations. – peek-a-boo Jun 04 '23 at 15:45
  • Thank you for the book references. :) – Petar Jun 04 '23 at 15:46

1 Answer


Most of the treatments of the inverse function theorem or the implicit function theorem are based on finding fixed points of a contraction in Banach spaces. Smoothness of the implicit solution $y\mapsto g(y)$ to the equation $F(g(y),y)=0$ follows from the smoothness of the inverse map $A\mapsto A^{-1}$ (defined on the set of invertible bounded operators on a Banach space $X$) and the smoothness of $F$. The main trick resides in finding good uniform bounds (via the mean value theorem), or by constructing uniform contractions. Here is a sketch of how one may proceed:

Definition Let $U$ and $V$ be open subsets of Banach spaces $X$ and $Y$ respectively. A function $F:\overline{U}\times V\longrightarrow \overline{U}$ is a uniform contraction if there exists $0\leq\theta<1$ such that \begin{align} \|F(x,y)-F(x',y)\|\leq \theta\|x-x'\| \qquad x,\,x'\in \overline{U},\, y\in V.\tag{0}\label{unif_contrac} \end{align}
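For a concrete scalar illustration (a toy example added here, not needed in what follows): take $X=Y=\mathbb{R}$, $U=(-2,2)$, $V=\mathbb{R}$ and \begin{align*} F(x,y)=\tfrac{1}{2}\cos x+\tfrac{1}{3}\arctan y. \end{align*} Then $|F(x,y)|\leq\tfrac{1}{2}+\tfrac{\pi}{6}<2$, so $F$ maps $\overline{U}\times V$ into $\overline{U}$ (in fact into $U$), and $|F(x,y)-F(x',y)|=\tfrac{1}{2}|\cos x-\cos x'|\leq\tfrac{1}{2}|x-x'|$, so \eqref{unif_contrac} holds with $\theta=\tfrac{1}{2}$.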

The following theorem shows that the fixed point of a uniform contraction $F$ is as smooth as the function $F$ itself.

Theorem (Uniform contraction principle): Suppose $W$ is a closed subset of a Banach space $X$ and $V$ is an open subset of a Banach space $Y$. Let $F:W\times V\longrightarrow W$ be a uniform contraction and let $x_*(y)$ denote the unique fixed point of $F(\cdot,y):W\longrightarrow W$ (which exists by the Banach fixed point theorem).

  1. If $F\in\mathcal{C}(W\times V,X)$, then $x_*\in \mathcal{C}(V,X)$.

Suppose $W=\overline{U}$ where $U$ is an open subset of $X$ and that $F(\overline{U}\times V)\subset U$.

  2. If $F\in\mathcal{C}(\overline{U}\times V,X)$ and $ F\in\mathcal{C}^r(U\times V,X)$ ($r\geq1$), then $x_*\in \mathcal{C}^r(V,X)$, for each $y\in V$ the linear operator $I-\partial_x F(x_*(y),y)\in L(X)$ has a bounded inverse, and \begin{align} x_*'(y)=\Big(I-\partial_xF(x_*(y),y)\Big)^{-1}\partial_yF(x_*(y),y),\quad y\in V.\tag{1}\label{smooth-fixedpoint} \end{align}

A proof of this result is given at the end of this posting. Having the uniform contraction principle at our disposal, we can establish the following result:

Theorem (Implicit function theorem): Let $X$, $Y$ and $Z$ be Banach spaces, $\Omega\subset X\times Y$ open and $F\in \mathcal{C}^r(\Omega,Z)$ for some $r\geq0$. When $r=0$ assume that $\partial_xF\in\mathcal{C}(\Omega)$. If $\partial_xF(x_0,y_0)\in\mathcal{L}(X,Z)$ has a bounded inverse for some $(x_0,y_0)\in\Omega$, then there is an open neighborhood $U\times V\subset\Omega$ of $(x_0,y_0)$ and a unique function $g:V\longrightarrow U$ such that \begin{align} g(y_0)=x_0,\qquad F(g(y),y)=F(x_0,y_0). \end{align} Moreover, $g\in\mathcal{C}^r(V,X)$ and if $r\geq1$, then for every $y\in V$ the linear operator $\partial_xF(g(y),y)\in L(X,Z)$ has a bounded inverse, and \begin{align} g'(y)=-\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y),\qquad y\in V.\tag{2}\label{imp_f_deriv} \end{align}
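As a sanity check of \eqref{imp_f_deriv} in the scalar case (an illustration added here): take $X=Y=Z=\mathbb{R}$, $F(x,y)=x^2+y^2-1$ and $(x_0,y_0)$ on the unit circle with $x_0>0$. Then $\partial_xF(x_0,y_0)=2x_0\neq0$, the implicit solution near $y_0$ is $g(y)=\sqrt{1-y^2}$, and \eqref{imp_f_deriv} gives \begin{align*} g'(y)=-\big(2g(y)\big)^{-1}(2y)=-\frac{y}{\sqrt{1-y^2}}, \end{align*} which agrees with differentiating $g$ directly.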

Proof of the implicit function theorem: Define $G:\Omega\longrightarrow X$ by \begin{align} G(x,y)=x-\big(\partial_xF(x_0,y_0)\big)^{-1} (F(x,y)-F(x_0,y_0)). \end{align} Observe that $G$ has the same smoothness as $F$; moreover, $x-G(x,y)=0$ iff $F(x,y)=F(x_0,y_0)$. Since $\partial_xG(x_0,y_0)=0$, for any $0<\theta<1$ there exist open balls $U$ and $V_1$ around $x_0$ and $y_0$ respectively, such that $\overline{U}\times \overline{V_1}\subset\Omega$ and $\sup_{(x,y)\in \overline{U}\times V_1}\|\partial_xG(x,y)\|\leq \theta<1$. The mean value theorem implies that \begin{align} \|G(x,y)-G(x',y)\|\leq\theta\|x-x'\|,\qquad x,\, x'\in \overline{U},\quad y\in V_1. \end{align} Let $\delta=\text{rad}(U)$. Since $F$ is continuous on $U\times V_1$ and \begin{align} \|G(x_0,y)-x_0\|\leq\|\big(\partial_xF(x_0,y_0)\big)^{-1}\| \|F(x_0,y)-F(x_0,y_0)\|, \end{align} there is an open ball $V\subset V_1$ around $y_0$ such that $\|G(x_0,y)-x_0\|<(1-\theta)\delta$ for all $y\in V$. Hence, \begin{align} \|G(x,y)-x_0\|\leq \|G(x,y)-G(x_0,y)\|+\|G(x_0,y)-x_0\|<\delta \end{align} for all $x\in \overline{U}$ and $y\in V$. This shows that $G:\overline{U}\times V\longrightarrow U$ is a uniform contraction with $G\in\mathcal{C}^r(U\times V,X)$. By the uniform contraction principle, for each $y\in V$ there is a unique $g(y)\in U$ such that $G(g(y),y)=g(y)$, that is, $F(g(y),y)=F(x_0,y_0)$; moreover, $g\in\mathcal{C}^r(V,X)$ and, if $r\geq 1$, \begin{align} g'(y)=\big(I-\partial_xG(g(y),y)\big)^{-1}\partial_yG(g(y),y)= -\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y) \end{align} for all $y\in V$. $\qquad\Box$
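To spell out the last equality: from the definition of $G$, \begin{align*} \partial_xG(x,y)=I-\big(\partial_xF(x_0,y_0)\big)^{-1}\partial_xF(x,y),\qquad \partial_yG(x,y)=-\big(\partial_xF(x_0,y_0)\big)^{-1}\partial_yF(x,y), \end{align*} so $I-\partial_xG(g(y),y)=\big(\partial_xF(x_0,y_0)\big)^{-1}\partial_xF(g(y),y)$. Since $\|\partial_xG(g(y),y)\|\leq\theta<1$, the operator $I-\partial_xG(g(y),y)$ is invertible (Neumann series), hence so is $\partial_xF(g(y),y)$, and \begin{align*} \big(I-\partial_xG(g(y),y)\big)^{-1}\partial_yG(g(y),y) &=-\big(\partial_xF(g(y),y)\big)^{-1}\partial_xF(x_0,y_0)\big(\partial_xF(x_0,y_0)\big)^{-1}\partial_yF(g(y),y)\\ &=-\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y). \end{align*}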

The inverse function theorem can be obtained as an application of the implicit function theorem.

Theorem (Inverse function theorem): Let $X$, $Y$ be Banach spaces, $W\subset X$ open, and let $f\in\mathcal{C}^r(W,Y)$, $r\geq1$. If $f'(x_0)$ has a bounded inverse for some $x_0\in W$, then there exists an open set $U\subset W$ containing $x_0$ such that $f(U)$ is open, $f:U\longrightarrow f(U)$ is bijective, the inverse function $g=f^{-1}\in \mathcal{C}^r(f(U),X)$, and \begin{align} g'(y)=\big(f'(g(y))\big)^{-1}, \qquad y\in f(U).\tag{3}\label{inv_f_deriv} \end{align}

Proof of the inverse function theorem: Applying the implicit function theorem to $F(x,y)=y-f(x)$ yields open neighborhoods $U'\subset W$ and $V\subset Y$ of $x_0$ and $y_0=f(x_0)$ respectively, such that for each $y\in V$ there exists a unique $g(y)\in U'$ satisfying $y=f(g(y))$. Moreover, the map $g:y\mapsto g(y)$ is necessarily in $\mathcal{C}^r(V,X)$. This uniqueness shows that $f$ is injective on $U'\cap f^{-1}(V)$.

The set $U=U'\cap f^{-1}(V)$ is an open neighborhood of $x_0$ with $V=f(U)$, and thus $f:U\longrightarrow V$ is a bijective function whose inverse is $f^{-1}=g$. Finally, the identity \eqref{inv_f_deriv} follows directly from \eqref{imp_f_deriv}. $\qquad\Box$
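Explicitly, for $F(x,y)=y-f(x)$ we have $\partial_xF(x,y)=-f'(x)$ and $\partial_yF(x,y)=I_Y$, so \eqref{imp_f_deriv} becomes \begin{align*} g'(y)=-\big(-f'(g(y))\big)^{-1}I_Y=\big(f'(g(y))\big)^{-1}, \end{align*} which is \eqref{inv_f_deriv}.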


For completeness, I add a proof of the uniform contraction principle that I have used in the past. I don't remember whether I proved it myself as an exercise or whether it came from a set of lecture notes from a summer school, so I owe you a source, but I am sure it is common knowledge.

First, here is a useful version of the mean value theorem:

Theorem (Mean value theorem): Suppose $F\in\mathcal{C}^1(U,Y)$ where $U\subset X$ is convex. For any $\boldsymbol{x},\,\boldsymbol{y}\in U$, \begin{align} \|F(\boldsymbol{x})-F(\boldsymbol{y})\|\leq M(\boldsymbol{x},\boldsymbol{y})\,\|\boldsymbol{x}-\boldsymbol{y}\| \end{align} where $M(\boldsymbol{x},\boldsymbol{y})=\sup_{0\leq t\leq 1}\|F'(\boldsymbol{x}+t(\boldsymbol{y-x}))\|$.

Conversely, if there is $M\geq0$ such that \begin{align} \|F(\boldsymbol{x})-F(\boldsymbol{y})\|\leq M\|\boldsymbol{x-y}\|,\qquad \boldsymbol{x},\,\boldsymbol{y}\in U, \end{align} then $\sup_{\boldsymbol{x}\in U}\|F'(\boldsymbol{x})\|\leq M$.
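A sketch of both parts: for $\boldsymbol{x},\,\boldsymbol{y}\in U$ the segment $t\mapsto\boldsymbol{x}+t(\boldsymbol{y}-\boldsymbol{x})$ stays in $U$ by convexity, and the fundamental theorem of calculus (with a Bochner integral) gives \begin{align*} F(\boldsymbol{y})-F(\boldsymbol{x})=\int_0^1F'\big(\boldsymbol{x}+t(\boldsymbol{y}-\boldsymbol{x})\big)(\boldsymbol{y}-\boldsymbol{x})\,dt, \end{align*} from which the bound with $M(\boldsymbol{x},\boldsymbol{y})$ follows by taking norms. For the converse (at interior points $\boldsymbol{x}$ of $U$), \begin{align*} \|F'(\boldsymbol{x})h\|=\lim_{t\to0^+}\tfrac{1}{t}\|F(\boldsymbol{x}+th)-F(\boldsymbol{x})\|\leq M\|h\|\qquad\text{for every } h, \end{align*} so $\|F'(\boldsymbol{x})\|\leq M$.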

The last part of the mean value theorem will be particularly useful in what follows.

Proof of the uniform contraction principle: (1) Notice that \begin{align} \|x_*(y&+h)-x_*(y)\|=\|F(x_*(y+h),y+h)-F(x_*(y),y)\|\\ &\leq \|F(x_*(y+h),y+h)-F(x_*(y),y+h)\|+\|F(x_*(y),y+h)-F(x_*(y),y)\|\\ &\leq \theta\|x_*(y+h)-x_*(y)\|+\|F(x_*(y),y+h)-F(x_*(y),y)\|. \end{align} The continuity of $F$ on $W\times V$ implies that \begin{align} \|x_*(y+h)-x_*(y)\|\leq \frac{1}{1-\theta}\|F(x_*(y),y+h)-F(x_*(y),y)\|\xrightarrow{h\rightarrow0}0. \end{align} Hence, $x_*\in\mathcal{C}(V,X)$.

(2) The assumption $F(\overline{U}\times V)\subset U$ implies that $x_*$ maps $V$ into $U$ since $x_*(y)=F(x_*(y),y)$. A formal application of the chain rule yields \begin{align} x'_*(y)=\partial_xF(x_*(y),y)x'_*(y)+\partial_yF(x_*(y),y)\tag{4}\label{formal_der} \end{align} at every $y\in V$ where $x_*$ is differentiable. Consider \eqref{formal_der} as a fixed point equation $T(z,y)=z$ where $T:\mathcal{L}(Y,X)\times V\rightarrow \mathcal{L}(Y,X)$ is given by \begin{align} T(z,y)=\partial_xF(x_*(y),y)z+\partial_yF(x_*(y),y)\tag{5}\label{fix_point_eqn} \end{align} The mean value theorem along with \eqref{unif_contrac} implies that \begin{align} \sup_{(x,y)\in U\times V}\|\partial_xF(x,y)\|\leq\theta\tag{6}\label{unif_bnd_der} \end{align} Hence $T$ is a uniform contraction and, by the first part of the proof, $T$ has a continuous fixed point $z:V\rightarrow\mathcal{L}(Y,X)$.
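Indeed, \eqref{unif_bnd_der} gives the contraction property of $T$ directly: for $z,\,z'\in\mathcal{L}(Y,X)$ and $y\in V$, \begin{align*} \|T(z,y)-T(z',y)\|=\|\partial_xF(x_*(y),y)(z-z')\|\leq\|\partial_xF(x_*(y),y)\|\,\|z-z'\|\leq\theta\|z-z'\|, \end{align*} while $T$ is continuous in $(z,y)$ because $x_*$, $\partial_xF$ and $\partial_yF$ are continuous (here the closed set playing the role of $W$ is all of $\mathcal{L}(Y,X)$).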
We will now show that $z$ is in fact the derivative of $x_*$. We fix $y\in V$ and set $B(y)=\partial_xF(x_*(y),y)$, $A(y)=\partial_yF(x_*(y),y)$. Let $h(k):=x_*(y+k)-x_*(y)$ for all $k$ small enough. The fixed point property of $x_*$ and $z$ together with the differentiability of $F$ implies that for all $k$ small enough \begin{align*} (I-B(y))(h(k)-z(y)k)&=F(x_*(y+k),y+k)-F(x_*(y),y)-B(y)h(k)-A(y)k\\ &=F(x_*(y)+h(k),y+k)-F(x_*(y),y)-B(y)h(k)-A(y)k\\ &=:P(h(k),k), \end{align*} where $\frac{\|P(h,k)\|}{\|h\|+\|k\|}\rightarrow0$ as $(h,k)\rightarrow(0,0)$. From \eqref{unif_bnd_der}, we have that $(I-B(y))\in\mathcal{L}(X)$ is an invertible operator with $(I-B(y))^{-1}\in\mathcal{L}(X)$. This shows that \begin{align} x_*(y+k)=x_*(y)+z(y)k+r(k) \end{align} where $r(k)=o(k)$ as $k\rightarrow0$ (the estimate behind this last step is spelled out below).
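In more detail, part (1) provides the control on $h(k)$ that the last step needs: for $\|k\|$ small enough, \begin{align*} \|h(k)\|\leq\frac{1}{1-\theta}\|F(x_*(y),y+k)-F(x_*(y),y)\|\leq\frac{\|\partial_yF(x_*(y),y)\|+1}{1-\theta}\,\|k\|, \end{align*} so $\|P(h(k),k)\|=o(\|h(k)\|+\|k\|)=o(\|k\|)$ as $k\rightarrow0$, and therefore \begin{align*} \|h(k)-z(y)k\|\leq\|(I-B(y))^{-1}\|\,\|P(h(k),k)\|=o(\|k\|), \end{align*} which is precisely differentiability of $x_*$ at $y$ with $x_*'(y)=z(y)$.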

For $r>1$, the result follows by induction. Suppose the result holds for $r-1$. Then, in particular, $x_*\in\mathcal{C}^{r-1}(V,X)$. The fact that $x_*$ satisfies \eqref{formal_der} implies that \begin{align} x_*'(y)=\big(I-\partial_xF(x_*(y),y)\big)^{-1}\partial_yF(x_*(y),y). \tag{7}\label{deriv_implicit} \end{align} Since the inversion map $A\mapsto A^{-1}$ from $GL(X)$ to $GL(X)$ is $\mathcal{C}^\infty$, it follows that $x_*\in\mathcal{C}^r(V,X)$ whenever $F\in\mathcal{C}^r(U\times V,X).\quad\Box$
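For reference, the smoothness of the inversion map used in the last step can be read off from the Neumann series: for $A\in GL(X)$ and $\|H\|<1/\|A^{-1}\|$, \begin{align*} (A+H)^{-1}=\big(I+A^{-1}H\big)^{-1}A^{-1}=A^{-1}-A^{-1}HA^{-1}+O(\|H\|^2), \end{align*} so $A\mapsto A^{-1}$ is differentiable with derivative $H\mapsto-A^{-1}HA^{-1}$, and iterating this observation shows that it is in fact $\mathcal{C}^\infty$ on $GL(X)$.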

Mittens