I'm trying to take the derivative of a fourth-order expression with respect to a matrix. It has the following form:

$$\frac{\displaystyle \partial \bf a^T X^T X X^T X b}{\displaystyle \partial \bf X} = \Large ?$$

$\bf a$ and $\bf b$ are vectors and $\bf X$ is a matrix so, in effect, it's the derivative of a scalar with respect to a matrix.

I found the basic derivatives in Matrix Calculus on Wikipedia and I found the second order derivative in The Matrix Cookbook. This gives me the solution for the second order case

$$\frac{\displaystyle \partial \bf a^T X^T X b}{\displaystyle \partial \bf X} = \bf X (ab^T+ba^T)$$

I wonder if there is a similar solution for the 4th order case?

2 Answers

For convenience, define a new matrix variable

$$\eqalign{ M &= X^TX = M^T \cr }$$

Write the function in terms of the Frobenius (:) inner product, $A:B={\rm tr}(A^TB)$, and this new variable. Finding the differential and the gradient is then straightforward.

$$\eqalign{ f &= ab^T:MM^T \cr\cr df &= ab^T:(dM\,M^T+M\,dM^T) \cr &= ab^T:2\,{\rm sym}(dM\,M^T) \cr &= 2\,{\rm sym}(ab^T):dM\,M^T \cr &= (ab^T+ba^T):dM\,M^T \cr &= (ab^T+ba^T)M:dM \cr &= (ab^T+ba^T)M:2\,{\rm sym}(X^TdX) \cr &= \Big(M(ab^T+ba^T)+(ab^T+ba^T)M\Big):X^TdX \cr &= \Big(XM(ab^T+ba^T)+X(ab^T+ba^T)M\Big):dX \cr\cr \frac{\partial f}{\partial X} &= XM(ab^T+ba^T)+X(ab^T+ba^T)M\cr &= XX^TX(ab^T+ba^T)+X(ab^T+ba^T)X^TX\cr }$$
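A quick numerical sanity check of the final formula can be sketched with NumPy. This is not part of the derivation, just a central-difference comparison. The helper names (`f`, `grad_f`, `numerical_grad`), the sizes `m, n`, the seed, and the step `eps` are all illustrative choices, not anything fixed by the question.

```python
import numpy as np

def f(X, a, b):
    """Scalar function f(X) = a^T X^T X X^T X b."""
    M = X.T @ X
    return a @ M @ M @ b

def grad_f(X, a, b):
    """Closed-form gradient X M S + X S M, with M = X^T X and S = a b^T + b a^T."""
    M = X.T @ X
    S = np.outer(a, b) + np.outer(b, a)
    return X @ M @ S + X @ S @ M

def numerical_grad(func, X, eps=1e-6):
    """Central-difference approximation of d func / d X, one entry at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (func(X + E) - func(X - E)) / (2 * eps)
    return G

# Arbitrary test data (assumed sizes: X is m x n, a and b have length n).
rng = np.random.default_rng(0)
m, n = 5, 3
X = rng.standard_normal((m, n))
a = rng.standard_normal(n)
b = rng.standard_normal(n)

G_closed = grad_f(X, a, b)
G_numeric = numerical_grad(lambda Z: f(Z, a, b), X)

# Largest entrywise discrepancy between the two gradients.
max_abs_err = np.max(np.abs(G_closed - G_numeric))
print(max_abs_err)
```

If the closed form is right, the printed discrepancy should be tiny compared with the size of the gradient entries.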

greg
  • Hi, what does sym mean here? Thanks. – Jeff Faraci Jan 18 '17 at 05:08
  • @Integrals It is a function used to decompose a matrix into the sum of its symmetric and skew parts $$\eqalign{\operatorname{sym}(A)&=\frac{1}{2}(A+A^T)\cr \operatorname{skew}(A)&=\frac{1}{2}(A-A^T)\cr A&=\operatorname{sym}(A)+\operatorname{skew}(A)\cr}$$ – greg Jan 18 '17 at 15:26
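For readers who prefer to see the decomposition numerically, here is a small NumPy sketch. It checks that $A=\operatorname{sym}(A)+\operatorname{skew}(A)$ and that $A:\operatorname{sym}(B)=\operatorname{sym}(A):B$, which is the property used when the sym is moved from one factor to the other in the derivation above. The function names `sym`, `skew`, and `frob`, and the random $4\times 4$ test matrices, are assumptions made for illustration.

```python
import numpy as np

def sym(A):
    """Symmetric part of a square matrix."""
    return 0.5 * (A + A.T)

def skew(A):
    """Skew-symmetric part of a square matrix."""
    return 0.5 * (A - A.T)

def frob(P, Q):
    """Frobenius inner product P : Q = tr(P^T Q)."""
    return np.trace(P.T @ Q)

# Arbitrary square test matrices.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# A splits into its symmetric and skew parts.
decomposition_ok = np.allclose(A, sym(A) + skew(A))

# Symmetric and skew matrices are orthogonal under the Frobenius product.
cross_term = frob(sym(A), skew(B))

# The sym operator can be moved across the product: A : sym(B) == sym(A) : B.
lhs = frob(A, sym(B))
rhs = frob(sym(A), B)

print(decomposition_ok, cross_term, lhs - rhs)
```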

Let $f : \mathbb R^{m \times n} \to \mathbb R$ be defined by

$$f (\mathrm X) = \mathrm a^T \mathrm X^T \mathrm X \mathrm X^T \mathrm X \mathrm b$$

The directional derivative of $f$ in the direction of $\mathrm V$ at $\mathrm X$ is

$$\begin{array}{rl} D_{\mathrm V} f (\mathrm X) &= \mathrm a^T \mathrm V^T \mathrm X \mathrm X^T \mathrm X \mathrm b + \mathrm a^T \mathrm X^T \mathrm V \mathrm X^T \mathrm X \mathrm b + \mathrm a^T \mathrm X^T \mathrm X \mathrm V^T \mathrm X \mathrm b + \mathrm a^T \mathrm X^T \mathrm X \mathrm X^T \mathrm V \mathrm b\\ &= \mbox{tr} (\mathrm a^T \mathrm V^T \mathrm X \mathrm X^T \mathrm X \mathrm b) + \mbox{tr} (\mathrm a^T \mathrm X^T \mathrm V \mathrm X^T \mathrm X \mathrm b) + \mbox{tr} (\mathrm a^T \mathrm X^T \mathrm X \mathrm V^T \mathrm X \mathrm b) + \mbox{tr} (\mathrm a^T \mathrm X^T \mathrm X \mathrm X^T \mathrm V \mathrm b)\\ &= \mbox{tr} (\mathrm V^T \mathrm X \mathrm X^T \mathrm X \mathrm b \mathrm a^T) + \mbox{tr} (\mathrm X^T \mathrm X \mathrm b \mathrm a^T \mathrm X^T \mathrm V) + \mbox{tr} (\mathrm V^T \mathrm X \mathrm b \mathrm a^T \mathrm X^T \mathrm X) + \mbox{tr} (\mathrm b \mathrm a^T \mathrm X^T \mathrm X \mathrm X^T \mathrm V)\\ &= \langle \mathrm V, \mathrm X \mathrm X^T \mathrm X \mathrm b \mathrm a^T \rangle + \langle \mathrm X \mathrm a \mathrm b^T \mathrm X^T \mathrm X, \mathrm V \rangle + \langle \mathrm V, \mathrm X \mathrm b \mathrm a^T \mathrm X^T \mathrm X \rangle + \langle \mathrm X \mathrm X^T \mathrm X \mathrm a \mathrm b^T, \mathrm V \rangle\end{array}$$

Thus,

$$\begin{array}{rl} \nabla_{\mathrm X} f (\mathrm X) &= \mathrm X \mathrm X^T \mathrm X \mathrm b \mathrm a^T + \mathrm X \mathrm a \mathrm b^T \mathrm X^T \mathrm X + \mathrm X \mathrm b \mathrm a^T \mathrm X^T \mathrm X + \mathrm X \mathrm X^T \mathrm X \mathrm a \mathrm b^T\\ &= \mathrm X \mathrm X^T \mathrm X (\mathrm a \mathrm b^T + \mathrm b \mathrm a^T) + \mathrm X (\mathrm a \mathrm b^T + \mathrm b \mathrm a^T) \mathrm X^T \mathrm X \end{array}$$
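The identity $D_{\mathrm V} f(\mathrm X) = \langle \nabla f(\mathrm X), \mathrm V \rangle$ can be spot-checked numerically. The NumPy sketch below compares three numbers: the four-term product-rule expansion, the Frobenius inner product of the gradient with $\mathrm V$, and a central finite difference along $\mathrm V$. The sizes, the seed, the step `h`, and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

# Arbitrary test data (assumed sizes: X and V are m x n, a and b have length n).
rng = np.random.default_rng(2)
m, n = 4, 3
X = rng.standard_normal((m, n))
V = rng.standard_normal((m, n))
a = rng.standard_normal(n)
b = rng.standard_normal(n)

def f(X):
    """f(X) = a^T X^T X X^T X b."""
    return a @ X.T @ X @ X.T @ X @ b

# Four-term expansion: replace each factor of X (or X^T) by V (or V^T) in turn.
D_terms = (a @ V.T @ X @ X.T @ X @ b
           + a @ X.T @ V @ X.T @ X @ b
           + a @ X.T @ X @ V.T @ X @ b
           + a @ X.T @ X @ X.T @ V @ b)

# Gradient from the last line of the derivation.
S = np.outer(a, b) + np.outer(b, a)
grad = X @ X.T @ X @ S + X @ S @ X.T @ X

# Frobenius inner product <grad, V> = tr(grad^T V) = sum of entrywise products.
D_inner = np.sum(grad * V)

# Central finite difference of f along the direction V.
h = 1e-6
D_fd = (f(X + h * V) - f(X - h * V)) / (2 * h)

print(D_terms, D_inner, D_fd)
```

The first two values should agree to rounding error, and the finite difference should match them to several digits.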

  • Can you please share some resource to help me understand what's going on here. – learner Feb 22 '18 at 22:48
  • @learner Search for "directional derivative" and "Frobenius inner product". Note how the directional derivative is the inner product of the gradient and the direction vector. The rest should follow. – Rodrigo de Azevedo Feb 23 '18 at 10:56