
I would like to take the derivative of the following expression w.r.t. the matrix $A \in \mathbb{R}^{m \times n}$, i.e.,

$$ \frac{\partial \big( \sum_{i=1}^m (Ax)_i \big)}{\partial A}, $$

where $x \in \mathbb{R}^n$. The second answer here gives the derivative of the matrix-vector product w.r.t. the matrix, but I am not sure how it changes under the summation. I think it should work out cleanly, since differentiation and summation are both linear and can be interchanged. I am not sure how the indices would change, so any advice on that would be much appreciated.

2 Answers


First note that $\sum_i (Ax)_i = {\bf 1}^T A x$ where $\bf 1$ is a vector of ones. Moreover, $\frac{\partial}{\partial A}y^TAx = yx^T$, so

$$\frac{\partial \big(\sum_i (Ax)_i\big)}{\partial A} = {\bf 1}\cdot x^T =\begin{pmatrix} x_1&x_2&\cdots&x_n\\ x_1&x_2&\cdots&x_n\\ \vdots&\vdots&\ddots&\vdots\\ x_1&x_2&\cdots&x_n\\ \end{pmatrix} \in \mathbb{R}^{m \times n}$$
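
As a sanity check, the formula above can be compared against a finite-difference estimate. The following is only a minimal sketch in plain Python; the helper names `sum_of_Ax` and `numerical_gradient`, the step size `eps`, and the particular values of `A` and `x` are illustrative assumptions, not part of the derivation.

```python
def sum_of_Ax(A, x):
    """Return sum_i (A x)_i for a matrix A (list of rows) and a vector x."""
    return sum(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A)


def numerical_gradient(A, x, eps=1e-6):
    """Central-difference estimate of d/dA_jk of sum_i (A x)_i."""
    m, n = len(A), len(x)
    grad = [[0.0] * n for _ in range(m)]
    for j in range(m):
        for k in range(n):
            # Perturb only the (j, k) entry, in both directions.
            A_plus = [row[:] for row in A]
            A_minus = [row[:] for row in A]
            A_plus[j][k] += eps
            A_minus[j][k] -= eps
            grad[j][k] = (sum_of_Ax(A_plus, x) - sum_of_Ax(A_minus, x)) / (2 * eps)
    return grad


# Arbitrary example values (assumed for illustration): m = 2, n = 3.
A = [[1.0, -2.0, 0.5], [3.0, 4.0, -1.0]]
x = [2.0, -1.0, 3.0]

numeric = numerical_gradient(A, x)
analytic = [x[:] for _ in A]  # 1 x^T: every row is a copy of x

# Largest entrywise gap between the estimate and the closed form.
max_error = max(
    abs(numeric[j][k] - analytic[j][k])
    for j in range(len(A))
    for k in range(len(x))
)
print(max_error < 1e-6)
```

Since the function is linear in $A$, the central difference is exact up to rounding, so the two matrices should agree to within floating-point error regardless of the entries of $A$.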

Hyperplane

Because of the summation over $i$, you are differentiating a scalar (a quantity with no free indices) with respect to a matrix, so you obtain a matrix. Its components are$$\frac{\partial}{\partial A_{jk}}\sum_{il}A_{il}x_l=\sum_{il}\delta_i^j\delta_k^lx_l=x_k.$$This agrees with @Hyperplane's result ${\bf 1}\cdot x^T$: the $(j,k)$ entry is $x_k$, independent of the row index $j$.
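
To see the index bookkeeping concretely, here is the smallest non-trivial case, $m=n=2$, written out as an illustration (the choice of size is only for the example):

$$\sum_{i=1}^2 (Ax)_i = (A_{11}x_1 + A_{12}x_2) + (A_{21}x_1 + A_{22}x_2),$$

so that

$$\frac{\partial}{\partial A_{11}} = x_1,\quad \frac{\partial}{\partial A_{12}} = x_2,\quad \frac{\partial}{\partial A_{21}} = x_1,\quad \frac{\partial}{\partial A_{22}} = x_2, \qquad\text{i.e.}\qquad \begin{pmatrix} x_1 & x_2\\ x_1 & x_2 \end{pmatrix}.$$

Each entry $A_{jk}$ appears exactly once in the sum, multiplied by $x_k$, which is why the row index $j$ drops out.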

J.G.