Hello, I am trying to compute the derivative of the function
$f = x^TAx $
I can rewrite it to
$f = \sum_{i,j} A_{ij}x_ix_j $
Then find its derivative
$\frac{df}{dx_k} = \sum_j A_{kj}x_j + \sum_i A_{ik}x_i $
How do I rewrite this in matrix form?
A simple way to get this result is to use matrix calculus, with the convention that the derivative of a scalar with respect to a column vector is a row vector. You can prove that if $M$ is a constant matrix and $v$ is a column vector, then, as you'd expect, $\frac{\partial }{\partial v}(Mv) = M$. Second, you can prove as a special case that if $a$ and $b$ are column vectors, then $a^Tb$ and $b^Ta$ are equal as scalars, and: $$\frac{\partial}{\partial b} (b^T a) = \frac{\partial}{\partial b} (a^T b) = a^T$$
From there, all you need is the observation that $(Ax)$ is a column vector. Hence put $w(x) \equiv Ax$. Using the product rule, we find:
$$\frac{\partial }{\partial x} x^T A x = \frac{\partial }{\partial x} x^T w = x^T \frac{\partial w}{\partial x} + w^T = x^TA + (Ax)^T = x^T A + x^T A^T = x^T(A+A^T)$$
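A quick numerical sanity check of the result $x^T(A+A^T)$ can be sketched as follows. This is only an illustration, assuming NumPy is available; the helper names (`f`, `grad_closed_form`, `grad_finite_diff`), the step size `h`, and the random test data are all made up for the example. It compares the closed-form expression against a central-difference approximation of each partial derivative.

```python
import numpy as np

def f(A, x):
    """Quadratic form f(x) = x^T A x (x is a 1-D array)."""
    return x @ A @ x

def grad_closed_form(A, x):
    """Closed-form derivative x^T (A + A^T), returned as a 1-D array."""
    return x @ (A + A.T)

def grad_finite_diff(A, x, h=1e-6):
    """Central-difference approximation of df/dx_k for each k."""
    g = np.zeros_like(x)
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = h  # perturb only the k-th coordinate
        g[k] = (f(A, x + e) - f(A, x - e)) / (2 * h)
    return g

# Hypothetical test data: A is deliberately non-symmetric.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
x = rng.standard_normal(4)

print(np.allclose(grad_closed_form(A, x), grad_finite_diff(A, x), atol=1e-6))
```

Using a non-symmetric $A$ matters here: for symmetric $A$ the result reduces to $2x^TA$, so the check would not distinguish $A + A^T$ from $2A$.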
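Recall that, in index form, the product of a matrix $M$ and a column vector $v$ is the vector whose $i$-th entry is a sum over the second index of $M$:
$$Mv = \left[\sum_{j} M_{ij}\cdot v_j\right]_i$$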
We'll use this rule to convert your answer from a summation to a matrix multiplication.
$$\begin{bmatrix}\partial_1 f& \partial_2 f & \cdots & \partial_k f & \cdots &\partial_n f\end{bmatrix} = \left[\sum_{j} A_{kj}\cdot x_j\right]_k + \left[\sum_{i} A_{ik}\cdot x_i\right]_k$$
There are two terms in this equation. The first term matches our rule for matrix multiplication: we can rewrite $$\left[\sum_{j} A_{kj}\cdot x_j\right]_k \Rightarrow A\cdot x$$
The second term doesn't match exactly, because the summation runs over the first index of $A$ (the index $i$ in $A_{ik}$) instead of the second. To fix this, we'll take a transpose, using $A_{ik} = A^T_{ki}$.
$$ \left[\sum_{i} A_{ik}\cdot x_i\right]_k = \left[\sum_{i} A^T_{ki}\cdot x_i\right]_k \Rightarrow A^T \cdot x$$
In short,
$$\frac{\partial f}{\partial x} = \frac{\partial(x^T A x)}{\partial x} = (A + A^T)\cdot x$$
However, because the derivative is expected to be a row vector, it may be more appropriate to write the transpose of this:
$$\frac{\partial f}{\partial x} = \frac{\partial(x^T A x)}{\partial x} = x^T\cdot (A^T + A)$$
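The conversion from the two summations to the matrix form can be checked on a small example. The following is a minimal sketch, assuming NumPy; the function names and the $2\times 2$ test matrix are hypothetical and chosen only for illustration. `grad_by_sums` evaluates the two sums literally for each $k$, while `grad_matrix_form` uses $(A + A^T)\,x$.

```python
import numpy as np

def grad_by_sums(A, x):
    """Evaluate sum_j A[k, j] x[j] + sum_i A[i, k] x[i] for every k."""
    n = x.size
    g = np.zeros(n)
    for k in range(n):
        first = sum(A[k, j] * x[j] for j in range(n))   # k-th entry of A x
        second = sum(A[i, k] * x[i] for i in range(n))  # k-th entry of A^T x
        g[k] = first + second
    return g

def grad_matrix_form(A, x):
    """The same quantity written as a single matrix product."""
    return (A + A.T) @ x

# Hypothetical non-symmetric example.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([1.0, -1.0])

print(grad_by_sums(A, x))      # [-3. -3.]
print(grad_matrix_form(A, x))  # [-3. -3.]
```

Note that NumPy's 1-D arrays do not distinguish row vectors from column vectors, so `(A + A.T) @ x` and `x @ (A + A.T)` return the same array here; the row-versus-column distinction above is purely a matter of notation.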
In index notation, you can write $$\frac{\partial f}{\partial x_k}=A_{kj}x_j + A_{ik}x_i$$ where a summation is implied by the presence of a repeated index.
You can also change a summation index (aka a dummy index), without altering the result, e.g. $(x_iy_i = x_ky_k)$.
So let's change both dummy indices to $p$, yielding $$\eqalign{ \frac{\partial f}{\partial x_k} &= A_{kp}x_p + A_{pk}x_p \cr &= (A_{kp}+A_{pk})\,x_p \cr }$$ which in matrix notation is written as $\,(A+A^T)\,x$.
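The index-notation expression maps almost directly onto `numpy.einsum`, where a repeated index letter that is absent from the output is summed over. The sketch below, assuming NumPy and using made-up random data, evaluates $A_{kp}x_p + A_{pk}x_p$ and compares it with $(A+A^T)\,x$.

```python
import numpy as np

# Hypothetical test data: a random, non-symmetric 3x3 matrix and a vector.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

# A_{kp} x_p + A_{pk} x_p: the repeated index p is summed, k is the free index.
g_index = np.einsum("kp,p->k", A, x) + np.einsum("pk,p->k", A, x)

# Matrix notation: (A + A^T) x
g_matrix = (A + A.T) @ x

print(np.allclose(g_index, g_matrix))
```

The subscript strings `"kp,p->k"` and `"pk,p->k"` mirror the two terms exactly, which makes it easy to see that the second term differs from the first only by a transpose of $A$.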