While going through some notes on matrices, I stumbled upon the following identity:

$$\nabla_A \mbox{tr}(AB) = B^T$$

where $A$ and $B$ are square matrices of the same size. The trace operator returns a real number. How does the derivative of a scalar result in a matrix?

aroma

1 Answer

Let $A,B$ be matrices such that their product $AB$ is a square matrix.

Another way to write the trace is with the Frobenius inner product (:) notation, $X:Y = {\rm tr}(X^TY) = \sum_{i,j} X_{ij}Y_{ij}$, so that $${\rm tr}(AB) = A^T:B$$ By the cyclic property of the trace, ${\rm tr}(AB) = {\rm tr}(BA)$, this can also be written as $B^T:A$.

So the differential of the function is simply $$d\,{\rm tr}(AB) = A^T:dB + B^T:dA$$ Note that each term on the RHS is an inner product of two matrices, yielding a scalar result.
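It may help to see the same differential written out entry by entry. The following is a sketch that assumes $A$ is $m\times n$ and $B$ is $n\times m$ (the sizes $m$, $n$ and the index names $i$, $j$ are introduced here for illustration only):

$$\begin{aligned}
{\rm tr}(AB) &= \sum_{i=1}^{m}\sum_{j=1}^{n} A_{ij}B_{ji} \\
d\,{\rm tr}(AB) &= \sum_{i=1}^{m}\sum_{j=1}^{n} \left(A_{ij}\,dB_{ji} + B_{ji}\,dA_{ij}\right) \\
&= A^T:dB + B^T:dA
\end{aligned}$$

The coefficient of $dA_{ij}$ is $B_{ji} = (B^T)_{ij}$, and the coefficient of $dB_{ji}$ is $A_{ij} = (A^T)_{ji}$.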

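Depending on which variable you take to be the independent variable, the gradient is either $$\nabla_B\,{\rm tr}(AB) = A^T$$ or $$\nabla_A\,{\rm tr}(AB) = B^T$$ The gradient of a scalar function with respect to a matrix is the matrix of its partial derivatives with respect to each entry, which is why the result has the same shape as that variable.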
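A quick numerical sanity check can make this concrete. The sketch below is only an illustration, not part of the original derivation: it uses plain Python lists, and the helper names (`matmul`, `trace`, `f`, `numerical_gradient_wrt_A`, `transpose`), the sizes `m`, `n`, and the step `h` are all assumptions chosen for the example. It perturbs one entry of $A$ at a time, estimates $\partial\,{\rm tr}(AB)/\partial A_{ij}$ by a central difference, and compares the resulting matrix with $B^T$.

```python
import random

def matmul(X, Y):
    # Plain nested-list matrix product X @ Y.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace(X):
    # Sum of the diagonal entries of a square matrix.
    return sum(X[i][i] for i in range(len(X)))

def f(A, B):
    # The scalar function under study: tr(AB).
    return trace(matmul(A, B))

def numerical_gradient_wrt_A(A, B, h=1e-6):
    # Central-difference estimate of d tr(AB) / d A[i][j] for every entry,
    # collected into a matrix with the same shape as A.
    grad = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            A_plus = [row[:] for row in A]
            A_minus = [row[:] for row in A]
            A_plus[i][j] += h
            A_minus[i][j] -= h
            grad[i][j] = (f(A_plus, B) - f(A_minus, B)) / (2 * h)
    return grad

def transpose(X):
    return [list(col) for col in zip(*X)]

random.seed(0)
m, n = 2, 3  # A is m x n, B is n x m, so AB is square (m x m)
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
B = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]

grad = numerical_gradient_wrt_A(A, B)
BT = transpose(B)

# Largest entrywise gap between the numerical gradient and B^T.
max_error = max(abs(grad[i][j] - BT[i][j])
                for i in range(m) for j in range(n))
print("gradient matches B^T:", max_error < 1e-6)
```

Because ${\rm tr}(AB)$ is linear in $A$, the central difference should agree with $B^T$ up to floating-point rounding, and the estimated gradient has the same $m\times n$ shape as $A$.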

lynn