A clean proof of Chain rule for BV functions

Question

Let $u\in BV(\mathbb{R})$; then its distributional derivative satisfies $u' \in \mathcal{M}(\mathbb{R})$ (the space of bounded Radon measures). I am trying to prove, by a mollification argument, that for $f\in C^1(\mathbb{R})$ the function $f\circ u \in BV(\mathbb{R})$ satisfies the chain rule $(f\circ u)'=(f'\circ u)u'$ in the sense of measures.

For $u\in BV(\mathbb{R})$, let $u^{\epsilon}$ denote its mollification, so that $\left|u^{\epsilon}\right|_{BV(\mathbb{R})}\le\left|u\right|_{BV(\mathbb{R})}$; thus, up to a subsequence, $u^{\epsilon} \rightarrow u$ in $L^1_{loc}(\mathbb{R})$ and ${u^{\epsilon}}' \rightarrow u'$ weakly* in $\mathcal{M}(\mathbb{R})$.

Now, consider \begin{align} \int_{\mathbb{R}} \phi (f \circ u)'= \lim_{\epsilon \rightarrow 0} \int_{\mathbb{R}} \phi (f \circ u^{\epsilon})'= \lim_{\epsilon \rightarrow 0} \int_{\mathbb{R}} \phi (f' \circ u^{\epsilon}) {u^{\epsilon}}' = \int_{\mathbb{R}} \phi (f' \circ u) {u}' \label{1}\tag{1} \end{align} which implies that the chain rule holds. I have the following doubts:

Is the above argument correct? In particular, is the last equality in \eqref{1} justified, given that it relies on the product of a weakly and a strongly convergent sequence being weakly convergent?
Can we justify the same if $f$ is only Lipschitz continuous, for example $f(x)=|x|$ with $f'(x)=\mathrm{sgn}(x)$?

If $u$ is the Heaviside function, then $(f \circ u)' = (f(1) - f(0))\delta \neq f'(1)\delta = (f' \circ u)u'$, where $\delta$ is the Dirac delta distribution. — dsh, Nov 12 '23 at 22:24
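
To make the counterexample in this comment concrete, here is a worked instance under two assumptions that are not in the original thread: the illustrative choice $f(x)=x^2$, and the convention that $H$ denotes the Heaviside function with $H(0)=1$ (note that $H$ is only locally of bounded variation).

$$
\begin{split}
f\circ H &= f(0) + \big(f(1)-f(0)\big)H = H,\\
(f\circ H)^\prime &= \big(f(1)-f(0)\big)\delta = \delta,\\
(f^\prime\circ H)H^\prime &= f^\prime(1)\,\delta = 2\delta \neq \delta .
\end{split}
$$

The mismatch arises because $H^\prime=\delta$ charges the jump point, where the value of $f^\prime\circ H$ depends on the value assigned to $H(0)$. Averaging $f^\prime$ across the jump, $\int_0^1 f^\prime(t)\,\mathrm{d}t = f(1)-f(0)$, recovers the correct coefficient in this example.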

Daniele Tampieri · Answer 1 · 2023-11-15T16:48:56.870

Answer to question 1.

Is the above argument correct? In particular, is the last equality in \eqref{1} justified, given that it relies on the product of a weakly and a strongly convergent sequence being weakly convergent?

Yes, but I'd choose the following route to prove the result, as it is more detailed, even if more complex. Let's proceed step by step.

The first thing to note is that $\{(f\circ u^\varepsilon)^\prime\}_{\varepsilon\in ]0, 1] } = \{(f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime\}_{\varepsilon\in ]0, 1] }$ is a sequence of Radon measures. Indeed $$ C_c(\Bbb R)\ni\varphi\mapsto \int\limits_{\Bbb R} \varphi ( f^\prime\circ u^\varepsilon ) {u^\varepsilon}^\prime \in{\Bbb R} $$ is a continuous map for every $\varepsilon\in\, ]0, 1]$ since
- $f^\prime\circ u^\varepsilon$ is a continuous function by construction, so $\varphi\mapsto\varphi\cdot ( f^\prime\circ u^\varepsilon)$ is a continuous map from $C_c(\Bbb R)$ to itself, and
- ${u^\varepsilon}^\prime\in \mathcal{M}(\Bbb R)$, again by construction, so the conclusion holds by the Riesz theorem (see for example [1], §1.4, p. 26, theorem 1.54 and remark 1.57).
Edit. At this point we still do not know if the sequence converges. We just know that we have a sequence of Radon measures which is obtained from two weakly* convergent sequences of Radon measures: we have not used this latter fact and, as we'll see, we will not use it in the following step.

Second edit. I am not sure that the argument used for this part of the answer is entirely correct. In particular, thinking about the bound \eqref{bd}, I doubt it is effective. I am not deleting the entire answer only because the second part is correct and also covers the case $f\in C^1$.
Now recall the definition of locally weak* convergence and a lemma on the convergence of sequences of Radon measures.
- (see [1], §1.4, p. 27, definition 1.58) Let $\mu$ and the sequence $\{\mu_h\}_{h}$ be $\Bbb R^m$-valued ($m\ge 1$) Radon measures on the locally compact separable metric space $X$; we say that $\{\mu_h\}$ locally weakly* converges to $\mu$ if $$\DeclareMathOperator{\Dd}{d\!} \lim_{h\to\infty} \int\limits_X \varphi \Dd \mu_h =\int\limits_X \varphi \Dd \mu $$ for every $\varphi \in C_c(X)$.
- A corollary of the classical de la Vallée Poussin compactness criterion for finite Radon measures (see [1], §1.4, p. 28, corollary 1.60). If a sequence $\{\mu_h\}_{h}$ of Radon measures on a locally compact separable metric space $X$ is such that $$ \sup\{|\mu_h|(K) \mid h\in\Bbb N\}<+\infty $$ for every compact set $K\subset X$, where $\lvert\mu_h\rvert$ is the total variation of the measure $\mu_h$, then it has a locally weakly* converging subsequence.
But then we are done, as the family $\{(f\circ u^\varepsilon)^\prime\}_{\varepsilon\in ]0, 1] } = \{(f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime\}_{\varepsilon\in ]0, 1] }$ satisfies exactly these requirements. Indeed, choosing $\varepsilon =\frac{1}{n}$, $n\in\Bbb N$, denoting by $\nu$ the mollifier, recalling that $BV(\Bbb R)\subset L^\infty(\Bbb R)$ in dimension one, and using Hölder's inequality we have $$ \begin{split} \lvert u^\varepsilon(x)\rvert & = \left \lvert\int\limits_{\Bbb R} u(x-y)\nu_{1\over n}(y)\Dd y\right\rvert \\ & = \left \lvert\int\limits_{\Bbb R} u(x-y)n \nu(ny)\Dd y\right\rvert\\ & = \left \lvert\int\limits_{\Bbb R} u\left(x-\frac{z}{n}\right)\nu(z)\Dd z\right\rvert\\ & \le \int\limits_{\Bbb R} \left\lvert u\left(x-\frac{z}{n}\right)\nu(z)\right\rvert\Dd z \\ & \le \lVert u\rVert_{L^\infty}\lVert \nu\rVert_{L^1} < + \infty \end{split} $$ i.e. the mollified version $u^\varepsilon$ of $u$ is bounded and its upper bound does not depend on the value of $\varepsilon$, thus $$ \begin{split} \sup_{\substack{\varphi\in C_c (K)\\ \|\varphi\|_{L^{\infty}} \leq 1}} \int\limits_K \varphi (f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime \Dd x &= \sup_{\substack{\varphi\in C_c (K)\\ \|\varphi\|_{L^{\infty}} \leq 1}} \int\limits_K \varphi (f\circ u^\varepsilon)^\prime\Dd x\\ &= \sup_{\substack{\varphi\in C_c (K)\\ \|\varphi\|_{L^{\infty}} \leq 1}}\int\limits_K \varphi^\prime (f\circ u^\varepsilon)\Dd x \\ &\le M_f \sup_{\substack{\varphi\in C_c (K)\\ \|\varphi\|_{L^{\infty}} \leq 1}} \int\limits_K \lvert\varphi^\prime\rvert \Dd x \end{split}\label{bd}\tag{BD} $$ where $M_f=\sup_{|x|\le \lVert u\rVert_{L^\infty}\lVert \nu\rVert_{L^1}}\lvert f(x)\rvert$.

The formula $$ \lim_{\varepsilon\to 0}(f\circ u^\varepsilon)^\prime = \lim_{\varepsilon\to 0} (f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime\triangleq (f\circ u)^\prime \triangleq (f^\prime\circ u)u^\prime $$ holds true and, as stated above, we do not need the fact that the sequence $\{(f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime\}_{\varepsilon\in ]0, 1] }$ is the product of two locally weak* convergent sequences.
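
For comparison, here is a sketch of where the product structure does enter, under the additional assumption (not made in the original) that $u$ is continuous. Then $u^\varepsilon\to u$ locally uniformly and, since $f^\prime$ is uniformly continuous on bounded sets, $f^\prime\circ u^\varepsilon\to f^\prime\circ u$ uniformly on $\operatorname{supp}\varphi$. For $\varphi\in C_c(\Bbb R)$ one may split the difference as

$$
\begin{split}
\int\limits_{\Bbb R} \varphi (f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime - \int\limits_{\Bbb R} \varphi (f^\prime\circ u)u^\prime
&= \int\limits_{\Bbb R} \varphi \big(f^\prime\circ u^\varepsilon - f^\prime\circ u\big){u^\varepsilon}^\prime \\
&\quad + \left(\,\int\limits_{\Bbb R} \varphi (f^\prime\circ u){u^\varepsilon}^\prime - \int\limits_{\Bbb R} \varphi (f^\prime\circ u)u^\prime\right),\\
\left\lvert \int\limits_{\Bbb R} \varphi \big(f^\prime\circ u^\varepsilon - f^\prime\circ u\big){u^\varepsilon}^\prime \right\rvert
&\le \lVert\varphi\rVert_{L^\infty} \sup_{\operatorname{supp}\varphi}\big\lvert f^\prime\circ u^\varepsilon - f^\prime\circ u\big\rvert \; \lvert u^\prime\rvert(\Bbb R) \xrightarrow[\varepsilon\to 0]{} 0 .
\end{split}
$$

The term in parentheses tends to zero by weak* convergence, because $\varphi\,(f^\prime\circ u)\in C_c(\Bbb R)$ when $u$ is continuous. When $u$ has a jump, the uniform convergence fails near the jump and the first term need not vanish, which is consistent with the Heaviside counterexample in the comments.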

Answer to question 2.

Can we justify the same if $f$ is only Lipschitz continuous, for example $f(x)=|x|$ with $f'(x)=\mathrm{sgn}(x)$?

Yes, it may be possible to proceed as shown above; nevertheless, we can obtain the most general result by using the classical definition of variation for functions of one real variable. Precisely, assuming that $f$ is Lipschitz with constant $M>0$, i.e. $f\in C^{0,1}(\Bbb R)$ with $\lvert f(x)-f(y)\rvert \le M \lvert x-y \rvert$ for all $x,y \in\Bbb R$, we have that $$ \begin{split} V_a^b(f\circ u) & =\sup_{P \in \mathscr{P}} \sum_{i=0}^{n_{P}-1} | f\circ u(x_{i+1})-f\circ u(x_i)|\\ &\le M \sup_{P \in \mathscr{P}} \sum_{i=0}^{n_{P}-1} | u(x_{i+1})- u(x_i)| =M V_a^b(u)\quad \forall a, b\in \Bbb R, \end{split} $$ where $\mathscr{P}$ denotes the set of partitions $a=x_0<x_1<\dots<x_{n_P}=b$ of $[a,b]$. This implies that $f\circ u\in BV_\text{loc}(\Bbb R)$, thus its derivative is a Radon measure. Furthermore, since $f\in C^{0,1}(\Bbb R)$, it is also absolutely continuous and almost everywhere differentiable, with derivative essentially bounded by its Lipschitz constant, and we can express the first derivative of the composition as $$ (f\circ u)^\prime = (f^\prime\circ u)u^\prime $$ since for all $\varphi \in C_c(\Bbb R)$ $$ \begin{split} \left|\int\limits_{\Bbb R} \varphi ( f\circ u )^\prime\right| & = \left|\int\limits_{\Bbb R} \varphi ( f^\prime\circ u ) {u}^\prime\right|\\ & \le M \int\limits_{\Bbb R} \lvert\varphi\rvert \,\mathrm{d}\lvert u^\prime\rvert \le M \lVert\varphi\rVert_{L^\infty}\lvert u^\prime\rvert(\Bbb R)< +\infty \end{split} $$ as $u\in BV(\Bbb R)$.
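
As a quick sanity check of the variation estimate, consider a hypothetical step function chosen only for illustration (it does not appear in the original): $u=\chi_{[0,1)}-\chi_{[1,2)}$ with $f(x)=\lvert x\rvert$, so that $M=1$ and $f\circ u=\chi_{[0,2)}$. On $[a,b]=[-1,3]$ the variations are just the sums of the jump sizes:

$$
\begin{split}
V_{-1}^{3}(u) &= \lvert 1-0\rvert + \lvert -1-1\rvert + \lvert 0-(-1)\rvert = 4,\\
V_{-1}^{3}(f\circ u) &= \lvert 1-0\rvert + \lvert 0-1\rvert = 2 \le M\, V_{-1}^{3}(u) = 4 .
\end{split}
$$

The jump of $u$ at $x=1$ is invisible to $\lvert u\rvert$, which shows that the inequality can be strict.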

Notes

For the sake of completeness, as briefly shown in this Q&A, a deeper result is true:

Theorem (Josephy [2], p. 355, theorem 4) For a given function $f:[0,1]\to[0,1]$, the composition $f\circ g$ is of bounded variation for all functions $g:[0,1]\to[0,1]$ of bounded variation if and only if $f$ satisfies a Lipschitz condition on $[0,1]$.

The fact that the domain and codomain of the functions in the statement of the theorem are $[0,1]$ does not reduce its generality: every finite interval will do.
In the same paper, theorem 3 states a necessary and sufficient condition on a function $g$ for $f\circ g$ to be of bounded variation for each $f$ of bounded variation, and identifies the class of such functions.
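
To illustrate why the Lipschitz condition cannot be dropped, here is a sketch with functions chosen for illustration only (they are not taken from [2]): let $f(x)=\sqrt{x}$, which maps $[0,1]$ to $[0,1]$ but is not Lipschitz near $0$, and let $g(1/n)=1/n^2$ for $n\ge 2$ and $g(x)=0$ at every other point of $[0,1]$. Each isolated spike contributes twice its height to the variation, so

$$
\begin{split}
V_0^1(g) &= 2\sum_{n=2}^{\infty}\frac{1}{n^2} < +\infty,\\
V_0^1(f\circ g) &= 2\sum_{n=2}^{\infty}\sqrt{\frac{1}{n^2}} = 2\sum_{n=2}^{\infty}\frac{1}{n} = +\infty .
\end{split}
$$

Thus $g$ is of bounded variation while $f\circ g$ is not, in agreement with the "only if" direction of the theorem.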

References

[1] Luigi Ambrosio, Nicola Fusco, Diego Pallara, Functions of bounded variation and free discontinuity problems, Oxford Mathematical Monographs, New York and Oxford: The Clarendon Press/Oxford University Press, New York, pp. xviii+434 (2000), ISBN 0-19-850245-1, MR1857292, Zbl 0957.49001.

[2] Michael Josephy, "Composing Functions of Bounded Variation", Proceedings of the American Mathematical Society Vol. 83, No. 2, pp. 354-356, (1981), MR624930, Zbl 0475.26005.

I apologize, I may not have taken care of all the details as needed for the Lipschitz case: as a matter of fact, the fact that the composition of a Lipschitz function and a $BV$-function is again a $BV$-function is one of the core results of the theory. I'll try to check and correct the details if needed — Daniele Tampieri, Nov 14 '23 at 16:38
Thanks for the answer. But it's not clear to me how you are justifying the product of two weak* converging sequences. Could you please elaborate a bit? — Veronica, Nov 14 '23 at 20:32
Hi Veronica, just a few words here before I add something more to clarify this point in the answer tomorrow morning. The sequence $\{(f\circ u^\varepsilon)^\prime\}_{\varepsilon\in ]0, 1] } = \{(f^\prime\circ u^\varepsilon){u^\varepsilon}^\prime\}_{\varepsilon\in ]0, 1] }$ is a sequence of Radon measures and it doesn't really matter that each of its terms is the product of two terms of weakly* converging sequences of Radon measures (note that this last fact is not used in the proof). — Daniele Tampieri, Nov 14 '23 at 20:55
...Continued from the previous comment. The crucial step in the proof is noting that this sequence has bounded variation, and thus its convergence follows from the corollary of the theorem of de la Vallée Poussin. I did not try to prove that the product of two weakly* convergent sequences of Radon measures is weakly* convergent because in this case there are obvious difficulties in dealing with the algebraic side of the problem. — Daniele Tampieri, Nov 14 '23 at 21:01
Thanks for your comments and for updating your answer. I am sorry, but I am not convinced yet. The corollary you mentioned only assures that the sequence of measures converges to some measure (say $\mu$), but what assures that this measure is indeed the product of the function $f'(u)$ and the measure $u'$? — Veronica, Nov 15 '23 at 21:20
@Veronica, thank you for your patience. I am not convinced either: currently I trust only the second part ($f\in C^{0,1}(\Bbb R)$). I'll try to see if the first part is salvageable or not. I should add that a proof of the $C^1$ case in dimension $N>1$ is included in an old paper of Vol'pert, while the general multidimensional Lipschitz case is proved in the book of Ambrosio et al. — Daniele Tampieri, Nov 15 '23 at 21:20
@Veronica I think I have a different nice route to the proof of part one, and I need some time to adjust the details. — Daniele Tampieri, Nov 17 '23 at 08:14
Thank you.. Take the time you need to refine the details, and I look forward to seeing more about your alternative approach — Veronica, Nov 17 '23 at 10:07
