Let $A$ be an $n\times n$ matrix and $x$ an $n \times 1$ vector. Consider the scalar $y = x'AA'x$. I want to compute the derivative of $y$ with respect to $A'$.


My attempt: write $y$ as

$$ y = \sum_{i=1}^n \sum_{j=1}^n \sum_{k=1}^n x_i a_{ik} a_{jk} x_j$$

and differentiate it with respect to $a_{ji}$, which is the $(i,j)$th element of the matrix $A'$. Is this approach correct? If so, how do I obtain the derivatives?

One possible (tedious) solution?

\begin{align*} \dfrac{\partial x'AA'x}{\partial A'} &= \begin{bmatrix} \dfrac{\partial x'AA'x}{\partial a_{11}} & \dfrac{\partial x'AA'x}{\partial a_{21}} & \cdots & \dfrac{\partial x'AA'x}{\partial a_{n1}}\\ & & \\ \dfrac{\partial x'AA'x}{\partial a_{12}} & \dfrac{\partial x'AA'x}{\partial a_{22}} & \cdots & \dfrac{\partial x'AA'x}{\partial a_{n2}}\\ \vdots & \vdots & \cdots & \vdots\\ \dfrac{\partial x'AA'x}{\partial a_{1n}} & \dfrac{\partial x'AA'x}{\partial a_{2n}} & \cdots & \dfrac{\partial x'AA'x}{\partial a_{nn}} \end{bmatrix}\\ &= \begin{bmatrix} 2x_1\sum\limits_{i=1}^n a_{i1}x_i & 2x_2\sum\limits_{i=1}^n a_{i1}x_i & \cdots & 2x_n\sum\limits_{i=1}^n a_{i1}x_i \\ & & \\ 2x_1\sum\limits_{i=1}^n a_{i2}x_i & 2x_2\sum\limits_{i=1}^n a_{i2}x_i & \cdots & 2x_n\sum\limits_{i=1}^n a_{i2}x_i \\ \vdots & \vdots & \cdots & \vdots\\ 2x_1\sum\limits_{i=1}^n a_{in}x_i & 2x_2\sum\limits_{i=1}^n a_{in}x_i & \cdots & 2x_n\sum\limits_{i=1}^n a_{in}x_i \end{bmatrix}\\ &= 2 \begin{bmatrix} A_{1.}' x x_1 & A_{1.}' x x_2 & \cdots & A_{1.}' x x_n\\ & & \\ A_{2.}' x x_1 & A_{2.}' x x_2 & \cdots & A_{2.}' x x_n\\ \vdots & \vdots & \cdots & \vdots\\ A_{n.}' x x_1 & A_{n.}' x x_2 & \cdots & A_{n.}' x x_n \end{bmatrix}\\ &= 2 A'x [ x_1\; x_2\; \cdots\; x_n]\\ &= 2 A'xx', \end{align*} where $A'_{j.}$ refers to the $j$th row of $A'$.
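A quick numerical sanity check of the result $2A'xx'$ can be sketched as follows. This is only an illustrative check under stated assumptions, not part of the derivation: the helper names (`quad_form`, `analytic_grad_wrt_At`, `numeric_grad_wrt_At`), the size $n = 4$, the random seed, and the step size `h` are all arbitrary choices made here. It compares the closed form with central finite differences, storing $\partial y / \partial a_{ik}$ at position $(k, i)$ so that the layout matches the matrix above.

```python
import random

def quad_form(A, x):
    """Return y = x' A A' x = ||A' x||^2 for a square list-of-lists A."""
    n = len(x)
    # (A' x)_k = sum_i a_{ik} x_i
    Atx = [sum(A[i][k] * x[i] for i in range(n)) for k in range(n)]
    return sum(v * v for v in Atx)

def analytic_grad_wrt_At(A, x):
    """Return 2 A' x x'; entry (k, i) is dy/da_{ik} = 2 (A' x)_k x_i."""
    n = len(x)
    Atx = [sum(A[i][k] * x[i] for i in range(n)) for k in range(n)]
    return [[2.0 * Atx[k] * x[i] for i in range(n)] for k in range(n)]

def numeric_grad_wrt_At(A, x, h=1e-6):
    """Central-difference estimate of dy/da_{ik}, stored at position (k, i)."""
    n = len(x)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            orig = A[i][k]
            A[i][k] = orig + h
            y_plus = quad_form(A, x)
            A[i][k] = orig - h
            y_minus = quad_form(A, x)
            A[i][k] = orig  # restore the perturbed entry
            G[k][i] = (y_plus - y_minus) / (2 * h)
    return G

random.seed(0)  # arbitrary seed, only for reproducibility
n = 4           # arbitrary size chosen for the illustration
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(n)]

G_exact = analytic_grad_wrt_At(A, x)
G_num = numeric_grad_wrt_At(A, x)
max_err = max(abs(G_exact[k][i] - G_num[k][i])
              for k in range(n) for i in range(n))
print("max abs difference:", max_err)
```

Because $y$ is quadratic in the entries of $A$, the central difference has no truncation error, so the printed difference should be at the level of floating-point rounding.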

Aqua

2 Answers


The derivative with respect to $a_{pq}$ is $$x_pa_{jq}x_j+x_ia_{iq}x_p=2x_ix_pa_{iq}=2(xx^\prime)_{pi}a_{iq}=(2xx^\prime A)_{pq}.$$
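The first step can be spelled out as follows (a sketch, with summation over repeated indices implied and $\delta$ denoting the Kronecker delta, which is notation introduced here). Since $\partial a_{ik}/\partial a_{pq} = \delta_{ip}\delta_{kq}$, the product rule applied to $y = x_i a_{ik} a_{jk} x_j$ gives

$$\frac{\partial y}{\partial a_{pq}} = x_i\,\delta_{ip}\delta_{kq}\,a_{jk}x_j + x_i a_{ik}\,\delta_{jp}\delta_{kq}\,x_j = x_p a_{jq} x_j + x_i a_{iq} x_p.$$

Note that this is the derivative with respect to $A$. Transposing gives the derivative with respect to $A'$, namely $(2xx'A)' = 2A'xx'$, which agrees with the result in the question.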

J.G.

Let scalar field $f : \mathbb R^{n \times n} \to \mathbb R$ be defined by

$$f ({\rm X}) := {\rm a}^\top {\rm X}^\top {\rm X} \, {\rm a} = \mbox{tr} \left( {\rm a}^\top {\rm X}^\top {\rm X} \, {\rm a} \right) = \mbox{tr} \left( {\rm X} \, {\rm a} {\rm a}^\top {\rm X}^\top \right)$$

whose gradient is

$$\nabla f ({\rm X}) = \color{blue}{2 {\rm X} \, {\rm a} {\rm a}^\top}$$
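One way to obtain this gradient is through the differential of $f$ (a sketch, using the convention that $\mathrm{d}f = \mbox{tr}\left( \nabla f({\rm X})^\top \, \mathrm{d}{\rm X} \right)$). Using the invariance of the trace under transposition and cyclic permutation,

$$\mathrm{d}f = \mbox{tr} \left( \mathrm{d}{\rm X} \, {\rm a} {\rm a}^\top {\rm X}^\top + {\rm X} \, {\rm a} {\rm a}^\top \mathrm{d}{\rm X}^\top \right) = 2 \, \mbox{tr} \left( {\rm a} {\rm a}^\top {\rm X}^\top \, \mathrm{d}{\rm X} \right) = \mbox{tr} \left( \left( 2 {\rm X} \, {\rm a} {\rm a}^\top \right)^\top \mathrm{d}{\rm X} \right),$$

from which the gradient can be read off. In the notation of the question, setting ${\rm X} = A'$ and ${\rm a} = x$ gives $f = x'AA'x$ and $\nabla f = 2A'xx'$.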