
I am reading Lee's Introduction to Smooth Manifolds, Theorem 4.12, and am stuck on understanding some statements.

Theorem 4.12 (Rank Theorem). Suppose $M$ and $N$ are smooth manifolds of dimensions $m$ and $n$ respectively, and $F:M\to N$ is a smooth map with constant rank $r$. For each $p\in M$ there exist smooth charts $(U,\varphi)$ for $M$ centered at $p$ and $(V, \psi)$ for $N$ centered at $F(p)$ such that $F(U) \subseteq V$, in which $F$ has a coordinate representation of the form $$ \hat{F}(x^1, \dots, x^r,x^{r+1},\dots, x^m)=(x^1,\dots , x^r,0,\dots ,0).$$
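To fix ideas, here is a toy example (my own, not from the book) that I will track through the proof: take $M = N = \mathbb{R}^2$ and $F(x,y) = \big(x+y,\, (x+y)^2\big)$, so that $$ DF(x,y) = \begin{pmatrix} 1 & 1 \\ 2(x+y) & 2(x+y) \end{pmatrix} $$ has constant rank $r = 1$. With the charts $\varphi(x,y) = (x+y,\, y)$ and $\psi(v,w) = (v,\, w - v^2)$, both centered at the origin, one checks that $\psi\circ F\circ\varphi^{-1}(x,y) = (x, 0)$, which is exactly the normal form promised by the theorem.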

Proof. Because the theorem is local, after choosing smooth coordinates we can replace $M$ and $N$ by open subsets $U \subseteq\mathbb{R}^{m}$ and $V \subseteq\mathbb{R}^{n}$. The fact that $D F ( p )$ has rank $r$ implies that its matrix has some $r \times r$ submatrix with nonzero determinant. **By reordering the coordinates, we may assume that it is the upper left submatrix, $\left( \partial F^{i} / \partial x^{j} \right)$ for $i, j=1, \ldots, r$.**

Q.1. Why is the bold statement true? How does this mechanism work? At first glance it seems to work, but I can't figure it out rigorously yet. Can anyone explain this in a friendlier way?

(Continuing the proof.) Let us relabel the standard coordinates as $( x, y )=\left( x^{1}, \ldots, x^{r}, y^{1}, \ldots, y^{m-r} \right)$ in $\mathbb{R}^{m}$ and $( v, w )=\left( v^{1}, \ldots, v^{r}, w^{1}, \ldots, w^{n-r} \right)$ in $\mathbb{R}^{n}$. By initial translations of the coordinates, we may assume without loss of generality that $p=( 0, 0 )$ and $F ( p )=( 0, 0 )$. If we write $F ( x, y )=\big( Q ( x, y ), R ( x, y ) \big)$ for some smooth maps $Q \colon U \to\mathbb{R}^{r}$ and $R \colon U \to\mathbb{R}^{n-r}$, then our hypothesis is that $\left( \partial Q^{i} / \partial x^{j} \right)$ is nonsingular at $( 0, 0 )$. Define $\varphi\colon U \to\mathbb{R}^{m}$ by $\varphi( x, y )=\big( Q ( x, y ), y \big)$. Its total derivative at $( 0, 0 )$ is $$ D \varphi( 0, 0 )=\begin{pmatrix} \dfrac{\partial Q^{i}}{\partial x^{j}} ( 0, 0 ) & \dfrac{\partial Q^{i}}{\partial y^{j}} ( 0, 0 ) \\ 0 & \delta_{j}^{i} \end{pmatrix}, $$
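In the toy example above, $Q(x,y) = x+y$ and $R(x,y) = (x+y)^2$; the hypothesis holds since $\partial Q/\partial x = 1$, and $\varphi(x,y) = (x+y,\, y)$ has $$ D\varphi(0,0) = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, $$ which is indeed nonsingular.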

where we have used the following standard notation: for positive integers $i$ and $j$, the Kronecker delta is defined by $$ \delta_{j}^{i}=\begin{cases} 1 & \text{if } i=j,\\ 0 & \text{if } i \neq j. \end{cases} \tag{4.4} $$ The matrix $D \varphi( 0, 0 )$ is nonsingular by virtue of the hypothesis. Therefore, by the inverse function theorem, there are connected neighborhoods $U_0$ of $(0,0)$ and $\widetilde{U}_0$ of $\varphi(0,0) = ( 0,0)$ such that $\varphi\colon U_{0} \to\widetilde{U}_{0}$ is a diffeomorphism. By shrinking $U_0$ and $\widetilde{U}_0$ if necessary, we may assume that $\widetilde{U}_{0}$ is an open cube. Writing the inverse map as $\varphi^{-1} ( x, y )=\left( A ( x, y ), B ( x, y ) \right)$ for some smooth functions $A\colon \widetilde{U}_0\to \mathbb{R}^{r}$ and $B \colon{\widetilde{U}}_{0} \to\mathbb{R}^{m-r}$, we compute $$ ( x, y )=\varphi\big( A ( x, y ), B ( x, y ) \big)=\big( Q \big( A ( x, y ), B ( x, y ) \big), B ( x, y ) \big). \tag{4.5} $$ Comparing $y$ components shows that $B ( x, y )=y$, and therefore $\varphi^{-1}$ has the form $$ \varphi^{-1} ( x, y )=\big( A ( x, y ), y \big). $$ On the other hand, $\varphi\circ\varphi^{-1}=\mathrm{Id}$ implies $Q \big( A ( x, y ), y \big)=x$, and therefore $F \circ\varphi^{-1}$ has the form $$ F \circ\varphi^{-1} ( x, y )=\big( x, \widetilde{R} ( x, y ) \big), $$ where ${\widetilde{R}} \colon{\widetilde{U}}_{0} \to\mathbb{R}^{n-r}$ is defined by $\widetilde{R} ( x, y )=R \big( A ( x, y ), y \big)$. The Jacobian matrix of this composite map at an arbitrary point $( x, y ) \in{\widetilde{U}}_{0}$ is $$ D \big( F \circ\varphi^{-1} \big) ( x, y )=\begin{pmatrix} \delta_{j}^{i} & 0 \\ \dfrac{\partial\widetilde{R}^{i}}{\partial x^{j}} ( x, y ) & \dfrac{\partial\widetilde{R}^{i}}{\partial y^{j}} ( x, y ) \end{pmatrix}. $$
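In the toy example, $\varphi^{-1}(x,y) = (x-y,\, y)$, so $A(x,y) = x-y$ and $B(x,y) = y$. Hence $F\circ\varphi^{-1}(x,y) = (x,\, x^2)$, i.e. $\widetilde{R}(x,y) = x^2$, and $$ D\big(F\circ\varphi^{-1}\big)(x,y) = \begin{pmatrix} 1 & 0 \\ 2x & 0 \end{pmatrix}, $$ whose last column vanishes identically, as the next step predicts.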

Since composing with a diffeomorphism does not change the rank of a map, this matrix has rank $r$ everywhere in $\widetilde{U}_0$. **The first $r$ columns are obviously linearly independent**, so the rank can be $r$ only if the derivatives $\partial\widetilde{R}^{i}/\partial y^{j}$ vanish identically on $\widetilde{U}_0$, which implies that $\widetilde{R}$ is actually independent of $(y^1, \dots, y^{m-r})$. (This is one reason we arranged for $\widetilde{U}_0$ to be a cube.) Thus, if we set $S(x)=\widetilde{R}(x,0)$, then we have $$ F\circ \varphi^{-1}(x,y) = ( x, S(x)).$$
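In the toy example, $S(x) = \widetilde{R}(x,0) = x^2$, so $F\circ\varphi^{-1}(x,y) = (x,\, x^2)$, and the target chart $\psi(v,w) = (v,\, w - S(v)) = (v,\, w - v^2)$ straightens the image to $(x,0)$; as far as I can tell, this is what the omitted rest of the proof does in general.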

(The rest of the proof is omitted.)

Q.2. Why is the second bold statement true? I.e., why are the first $r$ columns of $D(F \circ \varphi^{-1})(x,y)$ 'obviously' linearly independent? And in showing that $\widetilde{R}$ is actually independent of $(y^1,\dots, y^{m-r})$, how is the condition that $\widetilde{U}_0$ is a cube used?

Can anyone help?

Plantation

1 Answer


Too long for a comment, but

  • Q1: You’re just composing with suitable affine transformations on the domain and target (a worked instance appears in the last bullet of this list). See Trying to understand the statement of Rudin's Rank Theorem for more details.
  • Q2: A matrix of the form $\begin{pmatrix}I_r&0\\ A&B\end{pmatrix}$ obviously has the first $r$ columns independent. Call these columns $\xi_i=\begin{pmatrix}e_i\\y_i\end{pmatrix}$ for $i\in\{1,\dots, r\}$ (and $e_i\in\Bbb{R}^r$ the standard column vector with $1$ in $i^{th}$ spot and $0$ elsewhere). If $c_1,\dots, c_r\in\Bbb{R}$ are such that $\sum_{i=1}^rc_i\xi_i=0$, then in particular, $\sum_{i=1}^rc_ie_i=0$. This says precisely that $\begin{pmatrix}c_1\\\vdots\\c_r\end{pmatrix}=0$, thus proving linear independence of the $\xi_i$’s.
  • Q2: But in fact, we don’t even need to do all that. Clearly, because we have $I_r$ in the top left, we can do row operations (which of course do not affect the rank) to say that $\begin{pmatrix}I_r&0\\ A&B\end{pmatrix}$ and $\begin{pmatrix}I_r&0\\ 0&B\end{pmatrix}$ have the same rank, namely $r+\text{rank}(B)$. So, this has rank $r$ if and only if $\text{rank}(B)=0$, i.e. if and only if $B=0$.
  • Q2, last part: Fix $x$, and consider $\rho(y)=\widetilde{R}(x,y)$. Then the bottom right block $B$ is precisely $D\rho(y)$. Going from $D\rho(y)=0$ to $\rho=\text{const}$ obviously requires some assumptions on the domain of $\rho$, by the mean-value inequality. So, we should really assume that $\widetilde{U}_0$ is such that for each $x$, the set of $y$’s such that $(x,y)\in\widetilde{U}_0$ is connected. By taking $\widetilde{U}_0$ to be a cube, this is trivially satisfied, and you can prove this independence of $y$ using the FTC by integrating the zero derivative along any straight line parallel to the $y$-coordinate axes; this is spelled out in the next bullet.
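  • Q2, last part spelled out (using a single straight segment, which convexity allows, instead of axis-parallel segments; either works): since $\widetilde{U}_0$ is a cube, each slice $\{y : (x,y)\in\widetilde{U}_0\}$ is convex. Fix $x$ and two points $y_0, y_1$ in that slice. Applying the FTC to $t\mapsto\rho\big(y_0+t(y_1-y_0)\big)$ gives $$ \rho(y_1)-\rho(y_0)=\int_0^1 D\rho\big(y_0+t(y_1-y_0)\big)\,(y_1-y_0)\,dt=0, $$ since $D\rho\equiv 0$. Hence $\rho$ is constant on each slice, i.e. $\widetilde{R}(x,y)$ does not depend on $y$.
  • Q1, a worked instance (the particular matrix is just an illustrative choice): suppose $m=n=2$, $r=1$, and $DF(p)=\begin{pmatrix}0&1\\0&2\end{pmatrix}$, so the nonsingular $1\times 1$ submatrices sit in the second column. Precomposing $F$ with the coordinate swap $\sigma(x^1,x^2)=(x^2,x^1)$ multiplies the Jacobian on the right by the permutation matrix $P=\begin{pmatrix}0&1\\1&0\end{pmatrix}$, giving $DF(p)\,P=\begin{pmatrix}1&0\\2&0\end{pmatrix}$, whose upper-left entry is now nonsingular. In general, permuting the domain coordinates permutes the columns of $DF(p)$ and permuting the target coordinates permutes its rows, so any nonsingular $r\times r$ submatrix can be moved to the upper-left corner.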
peek-a-boo
  • Thank you. 1) For the answer to Q.1, I think I don't completely understand yet. What part of your linked post would be helpful for understanding this? Perhaps you can explain with a simpler case/example (e.g. when $n, m, r$ are small)? 2) For the last part of the answer to Q.2, although it seems to work, can you exhibit a proof of $D\rho(y) =0 \Rightarrow \rho = \operatorname{const}$, using the fact that "for each $x$, the set of $y$'s such that $(x,y) \in \widetilde{U}_0$ is connected" and the FTC (Fundamental Theorem of Calculus?), more formally? – Plantation Jun 19 '25 at 09:40
  • Is it related to the elementary fact that "on a connected open subset, zero derivative implies constancy"? Anyway, thank you again. – Plantation Jun 19 '25 at 09:40
  • @Plantation 1.) see “Stage 2” and within that “affine change of coordinates” – peek-a-boo Jun 19 '25 at 16:03
    2.) it is precisely the elementary fact you mentioned. The FTC is just a different proof (which can only be applied when the derivative has sufficient regularity for the FTC to apply; so for merely differentiable functions this would not work, but here we have smooth guys, so this is fine). Anyway, I’m simply referring to the very elementary multivariable calculus fact. – peek-a-boo Jun 19 '25 at 16:05
  • @Plantation note that the stuff about row/column operations is not an alternate proof. I did mention it explicitly (albeit briefly) in my linked answer (“Stage 1”), where rather than just shifting the matrix to the top left, I also reduced it fully to the identity $I_r$. Perhaps you didn’t find that explanation satisfactory, but in that case you should add your own answer, because this is a significant “edit” of an existing answer. So, I’m going to roll back just so it’s clear who wrote what. – peek-a-boo Jun 23 '25 at 01:59
  • Aha, O.K. Sorry for the radical edit :) Anyway, thank you. – Plantation Jun 23 '25 at 02:09