
I found the following identity in the Matrix Cookbook:

\begin{align*} \frac{\partial }{\partial X} \text{Trace}\left( X B X ^{\mathrm{T}} \right) = XB^{\mathrm{T}} + XB, \end{align*}

where $X\in\mathbb{R}^{n\times m}$ and $B\in\mathbb{R}^{m \times m}$. Any hints on how to prove this?

BasicUser

2 Answers


It helps to expand the matrix product:

$\begin{align*} \text{Trace}(XBX^{\mathrm{T}})&=\sum_{i}(XBX^{\mathrm{T}})_{ii}=\sum_{i}\sum_{j}X_{ij}(BX^{\mathrm{T}})_{ji}=\sum_{i}\sum_{j}X_{ij}\sum_{k}B_{jk}(X^{\mathrm{T}})_{ki}\\ &=\sum_{i}\sum_{j}\sum_{k}X_{ij}B_{jk}(X^{\mathrm{T}})_{ki}=\sum_{i}\sum_{j}\sum_{k}X_{ij}B_{jk}X_{ik} \end{align*}$

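Then fix one entry $X_{pq}$ and differentiate $\sum_{i}\sum_{j}\sum_{k}X_{ij}B_{jk}X_{ik}$ with the product rule. The entry $X_{pq}$ appears as the first factor when $(i,j)=(p,q)$ and as the second factor when $(i,k)=(p,q)$, so

$\begin{align*} \frac{\partial}{\partial X_{pq}}\text{Trace}(XBX^{\mathrm{T}})&=\sum_{k}B_{qk}X_{pk}\,+\,\sum_{j}X_{pj}B_{jq}=\sum_{k}X_{pk}(B^{\mathrm{T}})_{kq}\,+\,\sum_{j}X_{pj}B_{jq}\\ &=(XB^{\mathrm{T}})_{pq}\,+\,(XB)_{pq} \end{align*}$

Collecting these partial derivatives over all entries $(p,q)$ gives the matrix identity $XB^{\mathrm{T}} + XB$.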
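The entrywise formula can also be checked numerically. The following is a minimal sketch in plain Python, not part of the original derivation: the helper names (`trace_xbxt`, `gradient_formula`, `gradient_numeric`, `max_abs_diff`) and the sizes $n=3$, $m=4$ are illustrative assumptions. It compares $(XB^{\mathrm{T}})_{pq} + (XB)_{pq}$ with a central finite difference of the triple sum.

```python
import random


def trace_xbxt(X, B):
    """Trace(X B X^T), written as the triple sum of X_ij * B_jk * X_ik."""
    n, m = len(X), len(B)
    return sum(X[i][j] * B[j][k] * X[i][k]
               for i in range(n) for j in range(m) for k in range(m))


def gradient_formula(X, B):
    """Matrix whose (p, q) entry is (X B^T)_pq + (X B)_pq."""
    n, m = len(X), len(B)
    return [[sum(X[p][k] * B[q][k] for k in range(m))    # (X B^T)_pq
             + sum(X[p][j] * B[j][q] for j in range(m))  # (X B)_pq
             for q in range(m)]
            for p in range(n)]


def gradient_numeric(X, B, h=1e-4):
    """Central finite differences, perturbing one entry of X at a time."""
    n, m = len(X), len(B)
    G = [[0.0] * m for _ in range(n)]
    for p in range(n):
        for q in range(m):
            old = X[p][q]
            X[p][q] = old + h
            f_plus = trace_xbxt(X, B)
            X[p][q] = old - h
            f_minus = trace_xbxt(X, B)
            X[p][q] = old  # restore the entry before moving on
            G[p][q] = (f_plus - f_minus) / (2 * h)
    return G


def max_abs_diff(A, C):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - c) for ra, rc in zip(A, C) for a, c in zip(ra, rc))


random.seed(0)
n, m = 3, 4  # assumed sizes, chosen only for illustration
X = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(m)]

err = max_abs_diff(gradient_formula(X, B), gradient_numeric(X, B))
print("max abs difference:", err)
```

Because the function is quadratic in $X$, the central difference is exact up to rounding, so the printed difference should be tiny. Note that $B$ is deliberately not symmetric here, so the two terms $XB^{\mathrm{T}}$ and $XB$ really are different.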

user96233

Here is a coordinate-free solution.

Define $f : \mathbb{R}^{n\times m} \to \mathbb R$ by $$f(X) = \text{Trace}\left( X B X ^{\mathrm{T}} \right)$$

You have $f=h \circ g$ where $g(X) = X B X ^{\mathrm{T}}$ and $h(Y) = \text{Trace}(Y)$.

According to the chain rule you have $$(h \circ g)^\prime (X) = h^\prime(g(X)) \circ g^\prime(X)$$

$h$ is a linear map, so its (Fréchet) derivative at every point is $h$ itself: $$h^\prime(Y).V = \text{Trace}(V)$$

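As for $g$, it is quadratic: $g(X) = \beta(X, X)$, where $\beta(X, Y) = X B Y^{\mathrm{T}}$ is a bilinear map. So we have $$g^\prime(X).U = \beta(U, X) + \beta(X, U) = U B X ^{\mathrm{T}} + X B U ^{\mathrm{T}}$$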
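This formula for $g^\prime(X).U$ can be seen by expanding $g(X+U)$ directly. The expansion below is a sketch added for illustration, using only the definition $g(X) = XBX^{\mathrm{T}}$:

$$g(X+U) = (X+U)\,B\,(X+U)^{\mathrm{T}} = g(X) + \left(U B X^{\mathrm{T}} + X B U^{\mathrm{T}}\right) + U B U^{\mathrm{T}}$$

The middle term is linear in $U$, and the last term is of order $\|U\|^2$, so the middle term is the derivative of $g$ at $X$ applied to $U$.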

Using the chain rule with the above derivatives, you get $$f^\prime(X).U = \text{Trace}(U B X ^{\mathrm{T}} + X B U ^{\mathrm{T}})$$

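As $\text{Trace}$ is linear and $\text{Trace}(A)=\text{Trace}(A^{\mathrm{T}})$ for every square matrix $A$, we have $\text{Trace}(U B X ^{\mathrm{T}}) = \text{Trace}(X B^{\mathrm{T}} U ^{\mathrm{T}})$, and therefore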

$$f^\prime(X).U = \text{Trace}((X B^{\mathrm{T}} + X B)U ^{\mathrm{T}})$$

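Since $\text{Trace}(A U ^{\mathrm{T}}) = \sum_{i,j} A_{ij} U_{ij}$ is the Frobenius inner product of $A$ and $U$, this identifies the gradient of $f$ at $X$ as $X B^{\mathrm{T}} + X B$, which is exactly the result you're looking for.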
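For readers who want to check the directional-derivative form numerically, here is a minimal sketch in plain Python. The helper names (`matmul`, `transpose`, `trace`, `f`, `chain_rule_form`, `gradient_pairing`, `central_difference`) and the sizes are assumptions made for this illustration. It evaluates $\text{Trace}(U B X^{\mathrm{T}} + X B U^{\mathrm{T}})$ and $\text{Trace}((X B^{\mathrm{T}} + X B)U^{\mathrm{T}})$ for a random direction $U$ and compares both with a central difference of $f$ along $U$.

```python
import random


def matmul(A, C):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][k] * C[k][j] for k in range(len(C)))
             for j in range(len(C[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def f(X, B):
    """f(X) = Trace(X B X^T)."""
    return trace(matmul(matmul(X, B), transpose(X)))


def chain_rule_form(X, B, U):
    """f'(X).U written as Trace(U B X^T + X B U^T)."""
    return (trace(matmul(matmul(U, B), transpose(X)))
            + trace(matmul(matmul(X, B), transpose(U))))


def gradient_pairing(X, B, U):
    """f'(X).U written as Trace((X B^T + X B) U^T)."""
    XBt = matmul(X, transpose(B))
    XB = matmul(X, B)
    G = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(XBt, XB)]
    return trace(matmul(G, transpose(U)))


def central_difference(X, B, U, t=1e-4):
    """(f(X + tU) - f(X - tU)) / (2t), a numerical directional derivative."""
    X_plus = [[x + t * u for x, u in zip(rx, ru)] for rx, ru in zip(X, U)]
    X_minus = [[x - t * u for x, u in zip(rx, ru)] for rx, ru in zip(X, U)]
    return (f(X_plus, B) - f(X_minus, B)) / (2 * t)


random.seed(1)
n, m = 3, 4  # assumed sizes, chosen only for illustration
X = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(m)]
U = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]

d_chain = chain_rule_form(X, B, U)
d_grad = gradient_pairing(X, B, U)
d_num = central_difference(X, B, U)
print(d_chain, d_grad, d_num)
```

The three printed values should agree up to rounding. Agreement of the first two illustrates the transpose step, and agreement with the third illustrates the chain-rule computation.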