
I was reading this interesting post about changing the variables in an integral with a generic measure. I was wondering how this applies to the standard change of variables. In other words, $$\int_{F(\Omega)} f \,d\lambda = \int_{\Omega} (f \circ F)\, |\det DF| \,d\lambda $$ where $\lambda$ is the Lebesgue measure.

I think I have to show that $F_{*}(|\det DF|\lambda)=\lambda$ (where $F_*$ is the pushforward of measures). In other words, for every measurable $B$ I need to show $(|\det DF|\lambda)(F^{-1}(B))=\lambda(B)$, but I am not sure how to continue.

edamondo

1 Answer

First, let us consider a special-case restatement of the theorem presented in your link.

Let $(X,\mathcal{M})$ be a measurable space, $(Y,\mathcal{N},\nu)$ a measure space, and $F:X\to Y$ a measurable bijection with measurable inverse. Also, let $F^*\nu:=(F^{-1})_{*}\nu$ be the pullback measure. Then, for any measurable function $f:Y\to [0,\infty]$, and any measurable subset $A\subset X$, we have \begin{align} \int_{F[A]}f\,d\nu=\int_A(f\circ F)\,d(F^*\nu). \end{align}

I leave it to you to see how this follows from the version in the link. In order to ‘remember’ this theorem, I like to think of the following chain of equalities instead: \begin{align} \int_{F[A]}f\,d\nu=\int_{A}F^*(f\,d\nu)=\int_A(F^*f)\,d(F^*\nu):=\int_A(f\circ F)\,d(F^*\nu). \end{align} In this chain, the first and second terms are equal essentially by the definition of pullback of a measure. The first and last terms are equal by the quoted formula above. Finally, $F^*f:=f\circ F$ is standard notation for pullback of functions. I simply arranged the equalities in this manner because it is so reminiscent of the natural behavior exhibited by differential forms under pullback.
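
As a sanity check of the abstract formula, here is a minimal Python sketch on a finite set, where integrals are just weighted sums. The sets, the bijection `F`, the point masses `nu`, and the function `f` are all hypothetical choices made only for illustration.

```python
from fractions import Fraction

# Hypothetical finite example: X = {"a","b","c"}, Y = {1,2,3}, F a bijection X -> Y.
F = {"a": 1, "b": 2, "c": 3}

# nu is a measure on Y given by point masses (exact rational weights).
nu = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def pullback(E):
    """(F^* nu)(E) = ((F^{-1})_* nu)(E) = nu(F[E]) for a subset E of X."""
    return sum(nu[F[x]] for x in E)

def f(y):
    """A non-negative function on Y."""
    return y * y

A = {"a", "c"}                      # a subset of X
image_of_A = {F[x] for x in A}      # F[A], a subset of Y

# Left side: integral of f over F[A] with respect to nu.
lhs = sum(f(y) * nu[y] for y in image_of_A)

# Right side: integral of f∘F over A with respect to F^* nu.
rhs = sum(f(F[x]) * pullback({x}) for x in A)

print(lhs, rhs)
```

Both sums agree term by term, which is the "true for indicator functions by definition" step of the general proof in miniature.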

I'm also going to assume you know the following measure-theory results:

  1. If $\phi:B\subset\Bbb{R}^n\to\Bbb{R}^m$, with $n\leq m$ is locally Lipschitz then it sends (Lebesgue) measure zero sets to (Lebesgue) measure zero sets, and hence Lebesgue-measurable sets to Lebesgue-measurable sets.
  2. If $T:\Bbb{R}^n\to\Bbb{R}^n$ is a linear transformation, then for every Lebesgue-measurable set $A\subset\Bbb{R}^n$, we have $\lambda(T(A))=|\det T|\lambda(A)$.
  3. The Radon-Nikodym theorem
  4. Lebesgue's differentiation theorem
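
Point (2) is easy to check by hand in the plane. The following is a small sketch, assuming a particular matrix `T` and the unit square as the test set (both chosen only for illustration); it compares the area of the image parallelogram, computed with the shoelace formula, against $|\det T|$ times the area of the square.

```python
def shoelace_area(points):
    """Unsigned area of a simple polygon given by its vertices in order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def apply_linear(T, p):
    """Apply the 2x2 matrix T (given as a tuple of rows) to the point p."""
    return (T[0][0] * p[0] + T[0][1] * p[1],
            T[1][0] * p[0] + T[1][1] * p[1])

# Illustrative linear map; any invertible 2x2 matrix would do.
T = ((2.0, 1.0),
     (0.5, 3.0))
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image = [apply_linear(T, p) for p in square]

area_square = shoelace_area(square)
area_image = shoelace_area(image)

print(area_image, abs(det_T) * area_square)
```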

Our hypothesis is that $F:\Omega\to F[\Omega]\subset\Bbb{R}^n$ is a $C^1$ diffeomorphism between open subsets of $\Bbb{R}^n$, and $f:F[\Omega]\to[0,\infty]$ is Lebesgue measurable.

Note that $F$ and its inverse are $C^1$ and hence locally Lipschitz, so by (1), they are both Lebesgue-Lebesgue measurable (i.e. preimages and direct images of Lebesgue-measurable sets are Lebesgue measurable), and they send Lebesgue measure-zero sets to Lebesgue measure-zero sets. In particular, this implies that for any Lebesgue-measurable subset $A\subset \Omega$, if $\lambda(A)=0$, then \begin{align} (F^*\lambda)(A):=((F^{-1})_{*}\lambda)(A):= \lambda\left((F^{-1})^{-1}(A)\right)=\lambda(F(A))=0. \end{align} Thus, $F^*\lambda\ll\lambda$. So, by the abstract change of variables theorem, and the Radon-Nikodym theorem, we have \begin{align} \int_{F[\Omega]}f\,d\lambda&=\int_{\Omega}(f\circ F)\cdot d(F^*\lambda)\tag{COV}\\ &=\int_{\Omega}(f\circ F)\cdot \frac{d(F^*\lambda)}{d\lambda}\,d\lambda.\tag{Radon-Nikodym} \end{align} Note that strictly speaking, the non-trivial Radon-Nikodym theorem is used to assert the existence of the Radon-Nikodym derivative $\frac{d(F^*\lambda)}{d\lambda}$; the subsequent equality of the integrals is actually a simple exercise.

By Lebesgue's differentiation theorem, for $\lambda$-a.e $x\in \Omega$, we have \begin{align} \frac{d(F^*\lambda)}{d\lambda}(x)&=\lim_{r\to 0^+}\frac{1}{\lambda(B(x,r))}\int_{B(x,r)}\frac{d(F^*\lambda)}{d\lambda}\,d\lambda\\ &=\lim_{r\to 0^+}\frac{1}{\lambda(B(x,r))}\int_{B(x,r)}d(F^*\lambda)\\ &=\lim_{r\to 0^+}\frac{(F^*\lambda)(B(x,r))}{\lambda(B(x,r))}\\ &=\lim_{r\to 0^+}\frac{\lambda(F(B(x,r)))}{\lambda(B(x,r))}. \end{align} Our job is thus to calculate this ratio. So, fix a point $x\in \Omega$, and observe that by writing $F=DF_x\circ \underbrace{(DF_x)^{-1}\circ F}_{:=\phi}$, and using point (2) above, we have \begin{align} \frac{\lambda(F(B(x,r)))}{\lambda(B(x,r))}=|\det DF_x|\frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}. \end{align} Now, observe that by the chain rule, and the fact that $(DF_x)^{-1}$ is a fixed linear transformation, we have for all points $y\in \Omega$, $D\phi_y=DF_x^{-1}\circ DF_y$. So, in particular, at the point $x$ we have $D\phi_x=\text{id}_{\Bbb{R}^n}$. Thus, our entire problem has been reduced to proving the following lemma (which is interesting in its own right):

Let $\phi:\Omega\to\Bbb{R}^n$ be a $C^1$ function, where $\Omega\subset\Bbb{R}^n$ is open, such that at a point $x\in \Omega$, we have $D\phi_x=\text{id}_{\Bbb{R}^n}$. Then, $\lim\limits_{r\to 0^+}\frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}=1$.

For the proof, fix $0<\epsilon<1$. Then, by definition of $D\phi_x=\text{id}_{\Bbb{R}^n}$, there exists a $\delta_1>0$ such that if $\|h\|\leq \delta_1$ then \begin{align} \|\phi(x+h)-\phi(x)-h\|\leq \epsilon\|h\|. \end{align} So (triangle inequality), for any $r<\delta_1$, we have $\phi(B(x,r))\subset B(\phi(x),(1+\epsilon)r)$.

Next, by the inverse function theorem, $\phi$ is a local $C^1$ diffeomorphism with $D(\phi^{-1})_{\phi(x)}=\text{id}_{\Bbb{R}^n}$ as well, so we can apply the same reasoning as above to deduce there exists $\delta_2>0$ such that for any $r<\delta_2$, we have $\phi^{-1}(B(\phi(x),r))\subset B(x,(1+\epsilon)r)$, or equivalently, $B(\phi(x),r)\subset \phi(B(x,(1+\epsilon)r))$. Thus, if $0<r<\min(\delta_1,\delta_2)$ then \begin{align} B(\phi(x),(1-\epsilon)r)&\subset \phi\bigg(B(x, (1+\epsilon)(1-\epsilon)r)\bigg)\\ &\subset\phi(B(x,r))\\ &\subset B(\phi(x),(1+\epsilon)r). \end{align} Thus, taking measures and dividing by the measure of $B(x,r)$, we get the inequality \begin{align} \frac{\lambda(B(\phi(x),(1-\epsilon)r))}{\lambda(B(x,r))}\leq \frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))} \leq \frac{\lambda(B(\phi(x),(1+\epsilon)r))}{\lambda(B(x,r))}. \end{align} Now, recall that the Lebesgue measure is translation-invariant and that the measure of balls scales as the $n^{th}$ power of the radius. Thus, \begin{align} (1-\epsilon)^n\leq \frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))} \leq (1+\epsilon)^n. \end{align} Since $0<\epsilon<1$ was arbitrary, this shows $\lim\limits_{r\to 0^+}\frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}$ exists and equals $1$. Thus, the entire proof is complete.
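
The limit $\lim_{r\to 0^+}\lambda(F(B(x,r)))/\lambda(B(x,r))=|\det DF_x|$ can also be observed numerically. The sketch below assumes the illustrative map $F(u,v)=(e^u\cos v,\,e^u\sin v)$, for which $\det DF_{(u,v)}=e^{2u}$, and a base point chosen arbitrarily. It approximates $\lambda(F(B(x,r)))$ by the shoelace area of a polygon through the image of the boundary circle, which is only valid because this $F$ is injective on the small balls used here.

```python
import math

def F(u, v):
    """Illustrative C^1 map; a diffeomorphism on any strip of height < 2*pi in v."""
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))

def abs_det_DF(u, v):
    """|det DF| at (u, v); for this particular F it equals exp(2u)."""
    return math.exp(2.0 * u)

def image_area_of_ball(center, r, n=4000):
    """Approximate lambda(F(B(center, r))) by the shoelace area of the
    polygon through F applied to n equally spaced points on the circle."""
    cx, cy = center
    pts = [F(cx + r * math.cos(2.0 * math.pi * k / n),
             cy + r * math.sin(2.0 * math.pi * k / n)) for k in range(n)]
    total = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

x = (0.5, 0.3)  # arbitrary base point
ratios = {r: image_area_of_ball(x, r) / (math.pi * r * r)
          for r in (0.1, 0.01, 0.001)}

for r, ratio in ratios.items():
    print(r, ratio, abs_det_DF(*x))
```

As $r$ shrinks, the printed ratio should approach $e^{2\cdot 0.5}=e$, the value of $|\det DF_x|$ at the chosen point.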


To recap:

  1. We first invoked the general change of variables theorem.
  2. Next, we used the Radon-Nikodym theorem to express the integral with respect to $F^*\lambda$ as an integral with respect to $\lambda$, so our problem is now to calculate the Radon-Nikodym derivative $\frac{d(F^*\lambda)}{d\lambda}$.
  3. Lebesgue's differentiation theorem tells us that the Radon-Nikodym derivative can actually be calculated as a limit of a quotient of the two measures applied to balls (or really any other nicely shrinking set).
  4. We then used the fact that a linear transformation $T$ (in our case $DF_x$ for a fixed $x\in\Omega$) distorts Lebesgue measure by a factor of $|\det T|$, in order to reduce to the case where $DF_x=\text{id}_{\Bbb{R}^n}$.
  5. Finally, we used the lemma to show the limit is $1$.
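
To see the final formula $\int_{F[\Omega]}f\,d\lambda=\int_{\Omega}(f\circ F)\,|\det DF|\,d\lambda$ in action, here is a rough numerical sketch. It assumes the polar-coordinate map $F(r,\theta)=(r\cos\theta,r\sin\theta)$ on $\Omega=(0,1)\times(0,\pi/2)$, so that $F[\Omega]$ is the open quarter disk and $|\det DF|=r$, together with the test function $f(x,y)=xy$; these choices and the grid sizes are illustrative only. Both sides are approximated with midpoint Riemann sums and should be close to the exact value $1/8$.

```python
import math

def f(x, y):
    """Illustrative non-negative integrand on the quarter disk."""
    return x * y

def lhs(n=400):
    """Midpoint sum of f over F[Omega] = open quarter unit disk,
    using an n-by-n grid on [0,1]^2 and keeping cells whose midpoint lies inside."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x * x + y * y < 1.0:
                total += f(x, y)
    return total * h * h

def rhs(n=200):
    """Midpoint sum of (f∘F) * |det DF| over Omega = (0,1) x (0, pi/2),
    where F(r, t) = (r cos t, r sin t) and |det DF| = r."""
    dr = 1.0 / n
    dt = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            total += f(r * math.cos(t), r * math.sin(t)) * r
    return total * dr * dt

lhs_value = lhs()
rhs_value = rhs()
print(lhs_value, rhs_value)
```

The left-hand sum is the less accurate of the two because the grid only approximates the curved boundary of the quarter disk.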
peek-a-boo
  • This is a much clearer proof, thank you, and I am now able to read and understand it. Just one thing: $$D\phi_x=D\left[(DF_x)^{-1}\circ F\right]=D((DF_x)^{-1})_{F(x)}\circ DF_x=D(DF^{-1}_x)_{F(x)}\circ DF_x$$ And I'm not sure how you went from there – FShrike Jan 01 '22 at 14:51
  • @FShrike remember that $x$ is fixed, so $(DF_x)^{-1}$ is a fixed linear transformation; recall that linear transformations are their own derivatives at every point. So, for $\phi=(DF_x)^{-1}\circ F$, and any point $y\in\text{Domain}(F)$, we have $D\phi_y=D\bigg((DF_x)^{-1}\bigg)_{F(y)}\circ DF_y= (DF_x)^{-1}\circ DF_y$. Hence, if we want the derivative at the point $x$, then $D\phi_x=\text{id}_{\Bbb{R}^n}$. – peek-a-boo Jan 01 '22 at 14:55
  • "Linear transformations are their own derivative" - It's nice to hear that holds in multiple variables (of course!). Thank you for your continued help – FShrike Jan 01 '22 at 14:58
  • I just made it more specific, "at every point", because for a linear $T:V\to W$, it is not true that $DT=T$ (they don't even have the same target space). But for each $y$, $DT_y=T$, i.e. $DT:V\to\text{Hom}(V,W)$ is a constant map with value $T$. See here for the proof if you want. – peek-a-boo Jan 01 '22 at 15:01
  • Yes, $DT$ would be represented by some tensor I expect – FShrike Jan 01 '22 at 15:03
  • I'll understand this one day... – Clemens Bartholdy Feb 17 '22 at 14:25
  • Can you point me to a reference where I can find the general change of variables formula using the pullback measure? – ThiagoGM Nov 23 '24 at 16:55
  • @LucasLinhares that’s a standard fact. Look at OP’s link. The proof is the usual thing: for indicator functions it’s true by definition, by linearity it’s true for non-negative simple functions, by monotone convergence it is true for all non-negative measurable functions, hence for integrable ones it is by considering positive and negative parts. – peek-a-boo Nov 23 '24 at 18:00