
I read the proof for Hadamard's lemma available on Wikipedia but I do not understand the application of the chain rule in the proof.

In particular, $f$ is a smooth function defined on an open star-convex set $U$ of $n$-dimensional Euclidean space, and $a\in\mathbb{R}^n$.

The proof defines a function $h$ by: $$h(t)=f(a+t(x-a))$$ for a fixed value of $x\in U$ and $0\leq t\leq 1$.

What I don't understand is the application of the chain rule $$h'(t)=\sum_{k=1}^n\frac{\partial f}{\partial x_k}(a+t(x-a))(x_k-a_k).$$

This is because I believe that the partial derivatives should not be taken with respect to the $x_k$, but with respect to the arguments passed to $f$.

Formally, let $u_k(t)=a_k+t(x_k-a_k)$, then the chain rule states that: $$h'(t)=\sum_{k=1}^n\frac{\partial f}{\partial u_k}(a+t(x-a))\frac{d u_k}{d t}=\sum_{k=1}^n\frac{\partial f}{\partial u_k}(a+t(x-a))(x_k-a_k).$$

At least that is what I believe. By comparison, if I compute the partial derivative as stated in the proof of the lemma, I get $$\frac{\partial f}{\partial x_k}(a+t(x-a))=t\frac{\partial f}{\partial u_k}(a+t(x-a)).$$

This is not the same thing.

  • $h'(t)=\sum_{k=1}^n(\partial_kf)(a+t(x-a))\cdot (x_k-a_k)$; when written this way, there is no confusion at all. Here, $\partial_kf$ means the $k^{th}$ partial derivative of the function $f$; see here for more on terminology and notation. This equality is a trivial consequence of the chain rule ($D(\phi\circ\psi)_a=D\phi_{\psi(a)}\circ D\psi_a$ in a special case… see the first part of my answer here). Also notation in more abstract situations. – peek-a-boo Jul 24 '24 at 00:56
  • The ‘mistake’ you’re making in your last line is that you’re trying to compute the partials of the composite $x\mapsto f(a+t(x-a))$, but in fact the notation $\frac{\partial f}{\partial x_k}(a+t(x-a))$ means by definition $(\partial_kf)(a+t(x-a))$, i.e. a partial of $f$ evaluated at a weird point. If you want to write partials of the composite, you should write $\frac{\partial }{\partial x_k}\left(f(a+t(x-a))\right)$. The bracketing makes a difference as to the interpretation. Anyway, all this extra business with $x_k$ vs $u_k$ should be avoided on a first pass, as I explain in my first link. – peek-a-boo Jul 24 '24 at 01:00
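
For anyone who wants to check the two readings numerically, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the test function `f`, its hand-computed gradient `grad_f`, and the values of `a`, `x`, and `t` are all hypothetical choices, not anything from the Wikipedia proof. It compares a finite-difference estimate of $h'(t)$ with $\sum_k(\partial_kf)(a+t(x-a))(x_k-a_k)$, and separately compares the partials of the composite $x\mapsto f(a+t(x-a))$ with $t\,(\partial_kf)(a+t(x-a))$.

```python
import math

def f(u):
    # Hypothetical smooth test function on R^2 (an assumption for illustration).
    return math.sin(u[0]) * u[1] ** 2 + u[0] * u[1]

def grad_f(u):
    # Exact partials (d_1 f, d_2 f) of f with respect to its own argument slots.
    return [math.cos(u[0]) * u[1] ** 2 + u[1],
            2.0 * math.sin(u[0]) * u[1] + u[0]]

def point(a, x, t):
    # The point a + t(x - a) at which f is evaluated.
    return [ak + t * (xk - ak) for ak, xk in zip(a, x)]

def central_diff(g, s, eps=1e-6):
    # Symmetric finite-difference estimate of g'(s).
    return (g(s + eps) - g(s - eps)) / (2.0 * eps)

# Arbitrary sample values (assumptions, chosen only for the check).
a = [0.3, -1.2]
x = [1.1, 0.7]
t = 0.4

g = grad_f(point(a, x, t))  # (d_k f) evaluated at a + t(x - a)

# Reading 1: derivative of h(t) = f(a + t(x - a)) with respect to t.
h_prime_numeric = central_diff(lambda s: f(point(a, x, s)), t)
h_prime_formula = sum(gk * (xk - ak) for gk, xk, ak in zip(g, x, a))

# Reading 2: partials of the composite x -> f(a + t(x - a)), with t held fixed.
def composite_partial(k):
    def along_xk(s):
        y = list(x)
        y[k] = s
        return f(point(a, y, t))
    return central_diff(along_xk, x[k])

composite = [composite_partial(k) for k in range(len(x))]
scaled = [t * gk for gk in g]

print("h'(t) numeric vs formula:", h_prime_numeric, h_prime_formula)
print("composite partials vs t * (d_k f):", composite, scaled)
```

If the two printed pairs agree up to finite-difference error, that is consistent with both statements being true at once: the formula in the proof uses $(\partial_kf)$ evaluated at $a+t(x-a)$, while the extra factor of $t$ only appears when differentiating the composite with respect to $x_k$.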
