Let $f:\mathbb R^d \to \mathbb R, x \mapsto |x|^2$. Then $\nabla f (x) = 2x$ and $\operatorname{H} f (x) = 2I_d$. For $p \in \mathbb N^*$, let $g_p : \mathbb R^d \to \mathbb R, x \mapsto |x|^{2p}$. Then $g_p = f^p$. By the chain rule, $\nabla g_p (x) = p f^{p-1} (x) \nabla f (x)$.

My goal is to compute the Hessian matrix of $g_p$. Every vector is treated as a column vector. We need the following lemma.

Lemma. Let $F:\mathbb R^d \to \mathbb R^d$ and $h:\mathbb R^d \to \mathbb R$ be differentiable. Let $\mathrm J F (x)$ be the Jacobian of $F$ at $x$ and $\nabla h (x)$ the gradient of $h$ at $x$. Then $\mathrm J (hF)(x) = F (x) (\nabla h (x))^\top + h(x) \mathrm J F (x)$. (Below it is applied with $F = \nabla f$ and $h = f^{p-1}$.)

It follows that $$ \begin{align} \operatorname{H} g_p (x) &= \operatorname{J} (\nabla g_p) (x) \\ &= p \operatorname{J} (f^{p-1} \nabla f) (x) \\ &= p \big [ \nabla f (x) (\nabla f^{p-1} (x))^\top + f^{p-1} (x) \operatorname{J} (\nabla f) (x) \big ] \quad \text{by the Lemma}\\ &= p \big [ \nabla f (x) (\nabla f^{p-1} (x))^\top + f^{p-1} (x) \operatorname{H} f (x) \big ] \\ &= 2p \big [ (p-1)f^{p-2} (x) x ( \nabla f (x))^\top + f^{p-1} (x) I_d \big ] \quad \text{by} \quad \nabla f^{p-1} = \nabla g_{p-1} \\ &= 4p(p-1)|x|^{2p-4} x x^\top + 2p |x|^{2p-2} I_d. \end{align} $$
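As a numerical sanity check, substituting $\nabla f(x) = 2x$ into the last line gives $\operatorname{H} g_p(x) = 4p(p-1)|x|^{2p-4} x x^\top + 2p|x|^{2p-2} I_d$. The sketch below is plain Python with illustrative function names of my own (not from any library). It compares this closed form with central differences of the gradient $\nabla g_p(x) = 2p|x|^{2p-2}x$ at an arbitrarily chosen test point; it is a spot check under those assumptions, not a proof.

```python
def grad_g(x, p):
    """Gradient of g_p(x) = |x|^(2p), i.e. 2p |x|^(2p-2) x."""
    r2 = sum(t * t for t in x)
    return [2 * p * r2 ** (p - 1) * t for t in x]


def hessian_closed_form(x, p):
    """Closed form 4p(p-1)|x|^(2p-4) x x^T + 2p |x|^(2p-2) I_d."""
    d = len(x)
    r2 = sum(t * t for t in x)
    # For p = 1 the outer-product coefficient is 0; skip it so that
    # r2 ** (p - 2) is never evaluated with a negative exponent.
    outer = 4 * p * (p - 1) * r2 ** (p - 2) if p >= 2 else 0.0
    diag = 2 * p * r2 ** (p - 1)
    return [[outer * x[i] * x[j] + (diag if i == j else 0.0)
             for j in range(d)] for i in range(d)]


def hessian_numeric(x, p, h=1e-5):
    """Central differences of the gradient: column j is d(grad g_p)/dx_j."""
    d = len(x)
    H = [[0.0] * d for _ in range(d)]
    for j in range(d):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        g_plus, g_minus = grad_g(x_plus, p), grad_g(x_minus, p)
        for i in range(d):
            H[i][j] = (g_plus[i] - g_minus[i]) / (2 * h)
    return H


def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    x0 = [0.3, -1.2, 0.7]  # arbitrary test point, an assumption of this sketch
    for p in range(1, 5):
        diff = max_abs_diff(hessian_closed_form(x0, p), hessian_numeric(x0, p))
        print(p, diff)
```

The differences should be at the level of finite-difference error for each $p$; a wrong factor in front of $x x^\top$ would show up as a discrepancy of order one.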

Could you confirm whether my computation above is correct?

Analyst
  • 6,351

1 Answer

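For $i, j\in \{1, \ldots, d\}$, we have \begin{align} (\text{Hess }g_p)_{ij} &= \partial_{i}\left(\partial_j (f(x))^p\right)\\ &= \partial_i\left(pf(x)^{p-1}\partial_jf(x)\right)\\ &= p\left\{(p-1)f(x)^{p-2}\partial_i f(x)\partial_jf(x) + f(x)^{p-1} \partial^{2}_{ij}f(x)\right\}\\ &= pf(x)^{p-2}\left\{(p-1)\partial_i f(x)\partial_jf(x) + f(x)\partial^{2}_{ij}f(x)\right\}. \end{align} Since $\partial_i f\,\partial_j f$ is the $(i,j)$ entry of the outer product $\nabla f (\nabla f)^\top$ (gradients being column vectors), we get $$\text{Hess }g_p = pf^{p-2}\left\{(p-1)\nabla f (\nabla f)^\top + f\, \text{Hess }f\right\}.$$ With $\nabla f(x) = 2x$ and $\text{Hess }f = 2I_d$ this becomes $4p(p-1)|x|^{2p-4} x x^\top + 2p|x|^{2p-2} I_d$, which agrees with your result.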
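The componentwise derivation never uses the specific form of $f$, so the formula should hold for any twice-differentiable $f$ (with $f > 0$ if $p = 1$, so that $f^{p-2}$ makes sense). Here is a rough check in plain Python, reading the product of gradients as the outer product $\nabla f (\nabla f)^\top$. The test function $f(x) = 1 + x_0^2 + x_0 x_1 + 2x_1^2$ on $\mathbb R^2$ and all function names are hypothetical choices made only for this illustration.

```python
def f(x):
    """Hypothetical test function 1 + x0^2 + x0*x1 + 2*x1^2 (always > 0)."""
    return 1.0 + x[0] ** 2 + x[0] * x[1] + 2.0 * x[1] ** 2


def grad_f(x):
    """Analytic gradient of the test function."""
    return [2.0 * x[0] + x[1], x[0] + 4.0 * x[1]]


HESS_F = [[2.0, 1.0], [1.0, 4.0]]  # constant, since f is quadratic


def hess_power_formula(x, p):
    """p f^(p-2) { (p-1) grad_f grad_f^T + f Hess f }, using an outer product."""
    fx, g = f(x), grad_f(x)
    return [[p * fx ** (p - 2) * ((p - 1) * g[i] * g[j] + fx * HESS_F[i][j])
             for j in range(2)] for i in range(2)]


def hess_power_numeric(x, p, h=1e-4):
    """Second-order central differences applied directly to f(x)^p."""
    def F(y):
        return f(y) ** p

    def shifted(si, i, sj, j):
        # Copy of x moved by si*h along axis i and sj*h along axis j.
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return y

    return [[(F(shifted(1, i, 1, j)) - F(shifted(1, i, -1, j))
              - F(shifted(-1, i, 1, j)) + F(shifted(-1, i, -1, j))) / (4 * h * h)
             for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    x0 = [0.5, -0.3]  # arbitrary test point
    for p in (1, 2, 3):
        A, B = hess_power_formula(x0, p), hess_power_numeric(x0, p)
        print(p, max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2)))
```

Replacing the outer product by the inner product $(\nabla f)^\top \nabla f$ would give a scalar rather than a $d \times d$ matrix, which is why the order of the two factors matters here.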

Falcon
  • 4,433