
I see two different formulae.

https://www.comp.nus.edu.sg/~cs5240/lecture/matrix-differentiation-c.pdf

On page 9 of these slides, the derivative of $Ax$ is given as $A$.

But in some other documents, such as http://faculty.arts.ubc.ca/vmarmer/econ627/vector%20diff.pdf, the derivative is given as $A^T$.

How do I properly evaluate the derivative of $Ax$?

  • It looks to me like the second document also gives the derivative as $A$. –  Sep 14 '18 at 07:18

3 Answers


Let us use the definition of the derivative via the differential,

$$f(x_0+\Delta x)=f(x_0)+f'(x_0)\Delta x+o(\Delta x)$$

and apply it to $f(x)=Ax$:

$$A(x_0+\Delta x)=Ax_0+A\Delta x$$

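Since $f$ is linear, this expansion is exact: matching it against the definition gives $f'(x_0)\Delta x = A\Delta x$ for every $\Delta x$, with zero remainder. Therefore $f'(x)=A$.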
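As a rough numerical sanity check of this conclusion, the sketch below approximates the Jacobian of $f(x)=Ax$ by central differences and compares it with $A$. The matrix `A`, the point `x0`, the step `eps`, and the helper names `matvec` and `numerical_jacobian` are all made up for illustration; the Jacobian is stored with entry $(i,j)$ equal to $\partial f_i/\partial x_j$, which is an assumed layout.

```python
def matvec(A, x):
    """Return the matrix-vector product A x, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def numerical_jacobian(f, x0, eps=1e-6):
    """Approximate the Jacobian of f at x0 by central differences.

    Entry [i][j] approximates d f_i / d x_j.
    """
    m = len(f(x0))
    n = len(x0)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x0)
        xm = list(x0)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J


# Hypothetical example data: a 2 x 3 matrix and an arbitrary base point.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x0 = [0.5, -1.0, 2.0]

J = numerical_jacobian(lambda x: matvec(A, x), x0)

# Largest entrywise difference between the numerical Jacobian and A.
max_err = max(abs(J[i][j] - A[i][j]) for i in range(2) for j in range(3))
print("Jacobian shape:", (len(J), len(J[0])))
print("matches A up to rounding:", max_err < 1e-6)
```

The numerical Jacobian has the same shape as `A` (2 x 3 here), not the shape of its transpose, which is what the definition $f(x_0+\Delta x)=f(x_0)+f'(x_0)\Delta x+o(\Delta x)$ requires for the product $f'(x_0)\Delta x$ to make sense.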

user

The differential of $f(x)= Ax$ at a point $x_0$ is $Df(x_0)= A$.

This is because, with $Df(x_0)=A$, the numerator in the definition of the derivative vanishes identically:

$$\lim_{h\to 0} \frac{\|f(x_0+h) - f(x_0) - Df(x_0)h\|}{\|h\|} = \lim_{h\to 0} \frac{\|A(x_0+h) - Ax_0 - Ah\|}{\|h\|} = 0$$
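To make the limit concrete, here is a small sketch that evaluates the ratio $\|f(x_0+h)-f(x_0)-Dh\|/\|h\|$ for shrinking $h$, once with the candidate $D=A$ and once with $D=A^T$. The square matrix `A`, the point `x0`, the direction of `h`, and the helper names are hypothetical choices for illustration only; a square, non-symmetric `A` is assumed so that $A^T h$ is defined and differs from $Ah$.

```python
import math


def matvec(A, x):
    """Return the matrix-vector product A x, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(c * c for c in v))


def frechet_ratio(A, D, x0, h):
    """Return ||A(x0+h) - A x0 - D h|| / ||h|| for a candidate derivative D."""
    x1 = [a + b for a, b in zip(x0, h)]
    fx1, fx0, Dh = matvec(A, x1), matvec(A, x0), matvec(D, h)
    residual = [p - q - s for p, q, s in zip(fx1, fx0, Dh)]
    return norm(residual) / norm(h)


# Hypothetical example data: a non-symmetric 2 x 2 matrix and its transpose.
A = [[1.0, 2.0],
     [3.0, 4.0]]
At = [[1.0, 3.0],
      [2.0, 4.0]]
x0 = [1.0, -2.0]
direction = [1.0, 0.0]

ratios_A = []
ratios_At = []
for k in range(1, 6):
    h = [10.0 ** (-k) * d for d in direction]
    ratios_A.append(frechet_ratio(A, A, x0, h))    # candidate D = A
    ratios_At.append(frechet_ratio(A, At, x0, h))  # candidate D = A^T

print("D = A:  ", ratios_A)
print("D = A^T:", ratios_At)
```

With $D=A$ the ratio stays at rounding-error level for every step size, while with $D=A^T$ it does not shrink as $h\to 0$ for this choice of direction. So under this definition, where $Df(x_0)$ acts on $h$ by left multiplication, only $A$ qualifies.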

mechanodroid

It's either, depending on your definition of a derivative. For concreteness, suppose $x$ is $n\times 1$ and $A$ is $1\times n$. Then $f(x) = Ax$ is scalar-valued, so the derivative $\frac{df}{dx}$ is a gradient vector. But what shape should a gradient vector have? The answer depends on how you prefer to state the first-order Taylor approximation (the defining property of a derivative). The first of the two choices is

$$ f(x+h) \approx f(x) + \frac{df}{dx}(x)\cdot h, $$

i.e. $f(x+h) \approx f(x) + \left(\frac{df}{dx}(x)\right)^T h$, which corresponds to $\frac{df}{dx}(x)=A^T$ having the same shape as $h$, namely $n\times 1$. The alternative is to enforce

$$ f(x+h) \approx f(x) + \frac{df}{dx}(x)\, h, $$

which corresponds to $\frac{df}{dx}(x)=A$ having the shape of the transpose of $h$, namely $1\times n$.
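The two conventions can be compared side by side in a short sketch. Everything here is illustrative: the row matrix `A`, the columns `x` and `h`, and the helpers `matmul`, `transpose`, and `shape` are made-up names, and matrices are stored as nested lists so that the shapes are explicit.

```python
def matmul(P, Q):
    """Return the matrix product P Q for matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(P):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]


def shape(P):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(P), len(P[0]))


# Hypothetical example: A is 1 x n with n = 3; x and h are n x 1 columns.
A = [[2.0, -1.0, 0.5]]
x = [[1.0], [2.0], [3.0]]
h = [[0.1], [-0.2], [0.3]]

grad_col = transpose(A)  # first convention: df/dx = A^T, same shape as h
grad_row = A             # second convention: df/dx = A, shape of h^T

# First convention: the increment is (df/dx)^T h.
inc_col = matmul(transpose(grad_col), h)[0][0]
# Second convention: the increment is (df/dx) h.
inc_row = matmul(grad_row, h)[0][0]

# Exact change f(x + h) - f(x), for comparison.
x_plus_h = [[xi[0] + hi[0]] for xi, hi in zip(x, h)]
exact = matmul(A, x_plus_h)[0][0] - matmul(A, x)[0][0]

print("column-gradient shape:", shape(grad_col))
print("row-gradient shape:   ", shape(grad_row))
print("increments agree:", inc_col == inc_row)
print("match exact change:", abs(exact - inc_col) < 1e-12)
```

Both conventions predict the same change in $f$; they differ only in whether the object called $\frac{df}{dx}$ is stored as an $n\times 1$ column or a $1\times n$ row, which is why different references report $A^T$ or $A$.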

Calvin Khor