If $A$, $B$, $C$, $X$ are matrices, find: $$\frac{\partial \text{tr}[AXBXC^T]}{\partial X}.$$

Here's my initial approach:

$$\text{d}\,\text{tr}[AXBXC^T] = \text{tr}[\text{d}(AXB)XC^T] + \text{tr}[AXB \,\text{d}(XC^T)] = \text{tr}[(XC^T(B^T \otimes A) + AXB(B^T \otimes I))\,\text{d}X]$$

However, this approach doesn't yield a clean and clear derivative. Could someone provide a more straightforward method or correct my approach?

Thank you.

Robin
  • 6,201
Fernand
  • 29
  • What are the dimensions of the matrices A, B, C, X? Do any of the matrices have properties that can be exploited (e.g., symmetry, low rank, ...)? – Dennis Marx Jun 13 '24 at 09:01
  • @DennisMarx I don't know if they are needed. Can you please take a look at this – Fernand Jun 13 '24 at 09:07
  • Maybe applying the chain rule helps, i.e., $\frac{\partial}{\partial X} \text{tr}[AXBXC^T] = \text{tr}\left[\frac{\partial}{\partial X} AXBXC^T\right] = \text{tr}\left[\frac{\partial\,\left(AXBXC^T\right)}{\partial\,(XBX)} \cdot\frac{\partial\,\left(XBX\right)}{\partial X}\right]$ – Dennis Marx Jun 13 '24 at 09:58
  • According to this post:

    \begin{equation} \frac{\partial AXB}{\partial X}=B^T \otimes A\quad \text{hence for this problem}\rightarrow \frac{\partial\,\left(AXBXC^T\right)}{\partial\,(XBX)} = C \otimes A \end{equation}

    and regarding the second part, $\frac{\partial\,\left(XBX\right)}{\partial X}$, something similar can be found in the Matrix Cookbook, equations (79)-(80)

    – Dennis Marx Jun 13 '24 at 11:00

2 Answers

0

You might want to use index notation together with the definitions of the trace and of the partial derivative.

Using the Einstein summation convention (implied summation over repeated indices), you can rewrite the trace as $$ \text{trace}\left(AXBX{C}^{T}\right)=a_{km}x_{ml}b_{lp}x_{pq}c^{T}_{qk} $$ and then differentiate entry by entry, using $\partial x_{ml}/\partial x_{ij}=\delta_{mi}\delta_{lj}$: $$ \begin{array}{lcl} \frac{\partial}{\partial x_{ij}}\left(a_{km}x_{ml}b_{lp}x_{pq}c^{T}_{qk}\right) &=& \left(a_{km}\frac{\partial x_{ml}}{\partial x_{ij}}b_{lp}x_{pq}c^{T}_{qk}\right) + \left(a_{km}x_{ml}b_{lp}\frac{\partial x_{pq}}{\partial x_{ij}}c^{T}_{qk}\right) \\ &=& \left(a_{km}\delta_{mi}\delta_{lj}b_{lp}x_{pq}c^{T}_{qk}\right) + \left(a_{km}x_{ml}b_{lp}\delta_{pi}\delta_{qj}c^{T}_{qk}\right) \\ &=& \left(a_{ki}b_{jp}x_{pq}c^{T}_{qk}\right) + \left(a_{km}x_{ml}b_{li}c^{T}_{jk}\right) \\ &=& \left[\text{rearranging terms in products}\right] \\ &=& \left(b_{jp}x_{pq}c^{T}_{qk}a_{ki}\right) + \left(c^{T}_{jk}a_{km}x_{ml}b_{li}\right) \end{array} $$

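Notice that both terms are $(j,i)$-th entries of the corresponding matrices, $BXC^{T}A$ and $C^{T}AXB$, whereas we are looking for the $(i,j)$-th entry, since the derivative is taken w.r.t. $x_{ij}$. To make the indices conform, transpose both terms: $$ \frac{\partial\,\text{trace}\left(AXBXC^{T}\right)}{\partial X} = \left(BXC^{T}A\right)^{T} + \left(C^{T}AXB\right)^{T} = A^{T}CX^{T}B^{T} + B^{T}X^{T}A^{T}C. $$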
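A quick numerical sanity check can help confirm the transposition step. The sketch below is not part of the derivation: it compares the candidate gradient $(BXC^{T}A)^{T} + (C^{T}AXB)^{T}$ against central finite differences on small random matrices. The helper names (`matmul`, `chain`, `grad_closed_form`, `grad_finite_diff`) and the sizes `m, n, p` are assumptions chosen for illustration, and only the Python standard library is used.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def chain(*Ms):
    """Left-to-right product of several matrices."""
    out = Ms[0]
    for M in Ms[1:]:
        out = matmul(out, M)
    return out

def f(A, X, B, C):
    """The scalar function tr(A X B X C^T)."""
    M = chain(A, X, B, X, transpose(C))
    return sum(M[i][i] for i in range(len(M)))

def grad_closed_form(A, X, B, C):
    """Candidate gradient (B X C^T A)^T + (C^T A X B)^T, same shape as X."""
    Ct = transpose(C)
    return add(transpose(chain(B, X, Ct, A)), transpose(chain(Ct, A, X, B)))

def grad_finite_diff(A, X, B, C, h=1e-6):
    """Central differences of f with respect to each entry x_ij."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (f(A, Xp, B, C) - f(A, Xm, B, C)) / (2 * h)
    return G

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Assumed (hypothetical) sizes: A is m x n, X is n x p, B is p x n, C is m x p,
# so that A X B X C^T is square and the trace is defined.
rng = random.Random(0)
m, n, p = 2, 3, 4
A = rand_matrix(m, n, rng)
X = rand_matrix(n, p, rng)
B = rand_matrix(p, n, rng)
C = rand_matrix(m, p, rng)

G_exact = grad_closed_form(A, X, B, C)
G_num = grad_finite_diff(A, X, B, C)
max_err = max(abs(G_exact[i][j] - G_num[i][j])
              for i in range(n) for j in range(p))
print("max abs difference:", max_err)
```

Non-square sizes are used on purpose: if the transposes were dropped, the candidate gradient would have shape $p \times n$ instead of $n \times p$ and the comparison would not even line up. Because the function is quadratic in $X$, the central difference should agree with the closed form up to rounding error.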

0

The trace is invariant under cyclic permutation of its factors and under a change of basis, so in Einstein notation, in any basis,

$$\mathrm{d}\left(A_{ij} X_{jk}B_{kl}X_{lm}C_{im}\right)= A_{ij}\, \mathrm{d}X_{jk}\,B_{kl}X_{lm}C_{im}+ A_{ij} X_{jk}B_{kl}\,\mathrm{d}X_{lm}\,C_{im}= \operatorname{Tr}\left( B X C^T A \,\mathrm{d}X\right)+\operatorname{Tr}\left(C^T A X B \,\mathrm{d}X\right)$$
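To read the derivative off this differential, one common identification is $\mathrm{d}f = \operatorname{Tr}\left(G^T \mathrm{d}X\right) \Rightarrow \partial f/\partial X = G$, which gives a gradient with the same shape as $X$. That layout convention is an assumption here, since the answer does not state one. Under it, the remaining step would be:

$$
\begin{aligned}
\mathrm{d}\operatorname{Tr}\left(AXBXC^T\right)
  &= \operatorname{Tr}\left[\left(BXC^TA + C^TAXB\right)\mathrm{d}X\right] \\
\Longrightarrow\quad
\frac{\partial \operatorname{Tr}\left(AXBXC^T\right)}{\partial X}
  &= \left(BXC^TA + C^TAXB\right)^T
   = A^TCX^TB^T + B^TX^TA^TC .
\end{aligned}
$$

With the opposite (numerator) layout, the result would be the untransposed matrix $BXC^TA + C^TAXB$.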

Roland F
  • 5,122