
Let $f: \mathbb{R}^{m\times n} \to \mathbb{R}$ be a function mapping $m$-by-$n$ matrices to the real numbers. Define the derivative of $f$ with respect to $A$ to be:
$$\nabla_A f(A) = \begin{pmatrix} \frac{\partial f}{\partial A_{11}} & \ldots & \frac{\partial f}{\partial A_{1n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f}{\partial A_{m1}} & \ldots & \frac{\partial f}{\partial A_{mn}}\end{pmatrix}, $$ I am given that $$\nabla_A \mathrm{tr}(ABA^T C) = CAB + C^T A B^T. $$
This was stated in CS229 - Machine Learning without proof. How would I prove this? Is there an easy way (the notes say that it should be simple)?

Edit: I seem to have found a counter-example. If I let $A = B = C = I$, the identity matrix, the left-hand side gives me $I$ and the right-hand side gives me $2I$. However, throughout the course I've used this rule many times without fail. Have I missed something, or does the formula contain a typo?
Also, $ABA^TC$ requires $A$ to be $m\times n$, $B$ to be $n\times n$, $C$ to be $m\times n$ for $ABA^TC$ to be $m\times n$. The right-hand side has the product $CAB$, which doesn't make sense under these dimensions.
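
One way to test whether the formula is mistyped is a finite-difference gradient check. The sketch below is only an illustration, not taken from the CS229 notes: the helper names `f`, `numerical_grad` and `claimed_grad` are made up here, and it assumes $A$ is $m\times n$, $B$ is $n\times n$ and $C$ is $m\times m$, which is one choice of shapes for which the trace is defined.

```python
import numpy as np

def f(A, B, C):
    # f(A) = tr(A B A^T C)
    return np.trace(A @ B @ A.T @ C)

def numerical_grad(A, B, C, eps=1e-6):
    # Central differences, one entry of A at a time.
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            G[i, j] = (f(A + E, B, C) - f(A - E, B, C)) / (2 * eps)
    return G

def claimed_grad(A, B, C):
    # Right-hand side of the stated identity: C A B + C^T A B^T
    return C @ A @ B + C.T @ A @ B.T

# Assumed shapes (not from the notes): A is m x n, B is n x n, C is m x m.
rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, m))

# Largest entrywise gap between the numerical and the claimed gradient.
max_err = np.max(np.abs(numerical_grad(A, B, C) - claimed_grad(A, B, C)))
print(max_err)
```

The same two functions can be called with `A = B = C = np.eye(k)` to re-examine the identity-matrix case from the edit above.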

  • Check out https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf ; page 8 onward has many derivatives, so you can combine those results with the chain rule to get what you need. – them Aug 10 '19 at 13:15
  • See https://math.stackexchange.com/questions/1861213/differentiating-mboxtr-abatc-w-r-t-a?rq=1 . Also, they should be square matrices... – user550103 Aug 10 '19 at 15:23
