
I have to evaluate the derivative $$ \frac{\partial\det\mathcal{U}}{\partial F} $$ where $\mathcal{U}=\sqrt{F^TF}$ and $F$ is an $m\times n$ real matrix. Any suggestions would be appreciated.

Thank you all, guys!! You helped me a lot.

2 Answers


You can also approach the problem using differentials instead of the chain rule. It is easy to work with differentials, because algebraically they act like ordinary matrices.

Define a new matrix variable $W$, and its differential $$\eqalign{ W &= U^2 = F^TF = W^T \cr dW &= 2\,\operatorname{sym}(F^T\,dF)\cr }$$where $\operatorname{sym}(A)=\frac{1}{2}\,(A+A^T)$ is the symmetrization operation.

Now you want to find the gradient of $$\eqalign{ g &= \det(U) = \sqrt{\det W} \cr }$$ but it's more convenient to work with its logarithm instead $$\eqalign{ h &= \log(g) = \frac{1}{2}\,\log\det W\cr dh &= d\log(g) = \frac{1}{2}\,d\operatorname{tr}\log W \cr \frac{dg}{g} &= \frac{1}{2}\,W^{-1}:dW \cr dg &= \frac{g}{2}\,W^{-1}:dW \cr &= g\,W^{-1}:\operatorname{sym}(F^T\,dF) \cr &= g\,\operatorname{sym}(W^{-1}):F^T\,dF \cr &= g\,W^{-1}:F^T\,dF \cr &= g\,FW^{-1}:dF \cr }$$ where the colon denotes the double-dot (aka Frobenius) product, which can be defined as $$A:B=\operatorname{tr}(A^TB)$$

So the gradient of interest is $$\eqalign{ \frac{\partial g}{\partial F} &= gFW^{-1} \cr &= F(F^TF)^{-1}\det\sqrt{F^TF} \cr }$$
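As a sanity check, the closed form can be compared against central finite differences. This is only a minimal sketch: the function names, the random test matrix, and the step size `eps` are illustrative choices, and it assumes $F$ has full column rank so that $W=F^TF$ is invertible.

```python
import numpy as np

def det_sqrt_gram(F):
    """g(F) = det sqrt(F^T F), computed as sqrt(det(F^T F))."""
    return np.sqrt(np.linalg.det(F.T @ F))

def grad_det_sqrt_gram(F):
    """Closed-form gradient g * F (F^T F)^{-1}; assumes full column rank."""
    W = F.T @ F
    # solve(W, F.T) is W^{-1} F^T; W is symmetric, so its transpose is F W^{-1}
    return det_sqrt_gram(F) * np.linalg.solve(W, F.T).T

def finite_difference_grad(f, F, eps=1e-6):
    """Entrywise central-difference approximation of df/dF."""
    G = np.zeros_like(F)
    for idx in np.ndindex(*F.shape):
        E = np.zeros_like(F)
        E[idx] = eps
        G[idx] = (f(F + E) - f(F - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
F = rng.standard_normal((5, 3))  # arbitrary 5x3 test matrix
analytic = grad_det_sqrt_gram(F)
numeric = finite_difference_grad(det_sqrt_gram, F)
max_abs_error = np.max(np.abs(analytic - numeric))
print(max_abs_error)
```

The two gradients should agree up to finite-difference error; a large discrepancy would point to a rank-deficient or badly conditioned $F$.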

greg
  • Thanks. That's very much the result I expected. Let me go through it! – Asatur Khurshudyan Dec 21 '16 at 12:24
  • However, I can't understand these two formulas: $$ dh=\frac{1}{2}d\operatorname{tr}\left[\log\left(W\right)\right] $$ and $$ \frac{dg}{g}=\frac{1}{2}W^{-1}:dW. $$ Could you please explain them to me? Many thanks. – Asatur Khurshudyan Dec 21 '16 at 12:40
  • The first one is a corollary of Jacobi's formula $$d\log\det X=d\operatorname{tr}\log X$$ The second comes from the logarithmic derivative $$d\log x = \frac{dx}{x}$$ and the differential of the trace of a function $$d\operatorname{tr}f(X)=f^\prime(X^{T}):dX$$ – greg Dec 21 '16 at 15:31
  • Are you sure about the first formula? The corollary of Jacobi's formula I found here https://en.wikipedia.org/wiki/Jacobi%27s_formula says that $$ d\log\det X=\operatorname{tr}d\log X. $$ – Asatur Khurshudyan Dec 21 '16 at 15:40
  • @AsaturKhurshudyan The trace is linear so $\,\,\,d\operatorname{tr}f=\operatorname{tr}df\,\,\,$ – lynn Dec 21 '16 at 17:49
  • Could you please also explain how you get $$ g\,W^{-1}:\operatorname{sym}(F^T\,dF)=g\,\operatorname{sym}(W^{-1}):F^T\,dF? $$ – Asatur Khurshudyan Dec 22 '16 at 11:25
  • The value of the Frobenius product is unchanged if both arguments are transposed $$A^T:B=A:B^T$$ Using this fact $$\eqalign{B:(A+A^T) &= B:A + B:A^T \cr &= B:A + B^T:A \cr &= (B+B^T):A \cr}$$ Omitting the factor of $\frac{1}{2}$, this is exactly what the sym() operator is doing. – greg Dec 23 '16 at 01:40
  • http://math.stackexchange.com/questions/2125499/second-derivative-of-det-sqrtftf – Asatur Khurshudyan Feb 02 '17 at 10:32

Hint.

Name $f(F) = \det \sqrt{F^T F}$. You can write $f= f_3 \circ f_2 \circ f_1$ as the function composition of

$$\begin{array}{l|rcl} f_1 : & \mathcal M_{m,n}(\mathbb R) & \longrightarrow & \mathcal S^+(\mathbb R^n) \subseteq \mathcal M_{n}(\mathbb R) \\ & F & \longmapsto & F^T F \end{array}$$

$$\begin{array}{l|rcl} f_2 : & \mathcal S^+(\mathbb R^n) & \longrightarrow & \mathcal S^+(\mathbb R^n) \\ & M & \longmapsto & \sqrt{M} \end{array}$$

$$\begin{array}{l|rcl} f_3 : & \mathcal M_{n} & \longrightarrow & \mathbb R \\ & M & \longmapsto & \det M \end{array}$$

And finally

$$\begin{array}{l|rcl} f : & \mathcal M_{m,n}(\mathbb R) & \longrightarrow & \mathbb R \\ & F & \longmapsto & \det \sqrt{F^T F} \end{array}$$

You want to compute the Fréchet derivative $f^\prime$ of $f$. Applying the chain rule twice, you have: $$f^\prime(F)=(f_3^\prime((f_2 \circ f_1)(F))) \circ f_2^\prime(f_1(F)) \circ f_1^\prime(F)$$

Now you have to find the Fréchet derivatives of $f_1$, $f_2$ and $f_3$.

You have (Jacobi's formula, for a square matrix $M$): $$f_3^\prime(M).H = \operatorname{tr}(\operatorname{adj}(M)H)$$ and (based on the derivative of a bilinear function between vector spaces)

$$f_1^\prime(F).H=F^T H + H^T F$$

The most difficult part is computing $f_2^\prime$. That can be done using the implicit function theorem, knowing that $$(f_2(M))^2=M$$ for every $M \in \mathcal S^+(\mathbb R^n)$. See "Derivative (or differential) of symmetric square root of a matrix" for more details on that last point.
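One possible way to finish the hint, offered as a sketch rather than the intended solution: write $W=F^TF$, $U=\sqrt{W}$ and $K=f_1^\prime(F).H$. Differentiating $U^2=W$ shows that $X=f_2^\prime(W).K$ solves the Sylvester equation $UX+XU=K$. Multiplying by $U^{-1}$ on both sides and taking the trace gives $2\operatorname{tr}(U^{-1}X)=\operatorname{tr}(W^{-1}K)$, so $f^\prime(F).H=\det(U)\operatorname{tr}(W^{-1}F^TH)$, which agrees with the gradient $\det(U)\,FW^{-1}$ from the other answer. The code below checks this numerically; the function names and random test data are illustrative, $F$ is assumed to have full column rank, and the Sylvester equation is solved in the eigenbasis of $W$.

```python
import numpy as np

def chain_rule_directional(F, H):
    """f'(F).H assembled from f_1', f_2', f_3' as in the hint."""
    W = F.T @ F
    s2, Q = np.linalg.eigh(W)       # W = Q diag(s2) Q^T
    s = np.sqrt(s2)                 # eigenvalues of U = sqrt(W)
    U = (Q * s) @ Q.T               # U = Q diag(s) Q^T
    K = F.T @ H + H.T @ F           # f_1'(F).H
    # f_2'(W).K = X solves U X + X U = K; in the eigenbasis of U this
    # reduces to an elementwise division by s_i + s_j.
    K_eig = Q.T @ K @ Q
    X = Q @ (K_eig / (s[:, None] + s[None, :])) @ Q.T
    # f_3'(U).X = tr(adj(U) X), using adj(U) = det(U) U^{-1} for invertible U
    return np.linalg.det(U) * np.trace(np.linalg.solve(U, X))

def closed_form_directional(F, H):
    """Frobenius product of det(U) F W^{-1} with H."""
    W = F.T @ F
    g = np.sqrt(np.linalg.det(W))
    grad = g * np.linalg.solve(W, F.T).T
    return np.sum(grad * H)

rng = np.random.default_rng(1)
F = rng.standard_normal((4, 3))     # arbitrary full-column-rank test matrix
H = rng.standard_normal((4, 3))     # arbitrary direction
chain_value = chain_rule_directional(F, H)
closed_value = closed_form_directional(F, H)
print(chain_value, closed_value)
```

The two printed values should coincide up to rounding error.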