
Let's say I have a matrix $W$ and a vector $\vec{x}$ as below:

$$ W = \begin{bmatrix} w_{1,1} & w_{1,2} \\ w_{2,1} & w_{2,2} \end{bmatrix} $$ $$ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} $$ $$ \vec{y} = W\vec{x} = \begin{bmatrix} w_{1,1}x_1 + w_{1,2}x_2 \\ w_{2,1}x_1 + w_{2,2}x_2 \end{bmatrix} $$

Then how do I calculate $\frac{\partial \vec{y}}{\partial W}$?
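
For concreteness, here is a minimal numerical sketch of the object being asked about (this assumes JAX is available, and the values of $W$ and $\vec{x}$ are made up); it builds the derivative as a 3-index array via automatic differentiation:

```python
# Minimal sketch (assumes JAX; numeric values are made up) of
# J[i, j, k] = d y_i / d w_{j,k} as a 3-index array.
import jax
import jax.numpy as jnp

x = jnp.array([1.0, 2.0])          # x_1, x_2 (0-indexed in code)
W = jnp.array([[3.0, 4.0],
               [5.0, 6.0]])        # w_{1,1}, w_{1,2}; w_{2,1}, w_{2,2}

def y(W):
    return W @ x                   # y = W x

J = jax.jacobian(y)(W)             # shape (2, 2, 2)
print(J)                           # slice J[i] is zero except row i, which equals x
```

Each slice `J[i]` contains $\vec{x}$ in row $i$ and zeros elsewhere.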

  • According to Wikipedia.org [https://en.wikipedia.org/wiki/Matrix_calculus#Other_matrix_derivatives], there is no consensus about the definition of a derivative of a vector by a matrix. Can you be more specific? Are you interested, for instance, in the 3-index tensor $\frac{\partial (\vec y)_i}{\partial W_{jk}}$? – IgnoranteX Aug 12 '20 at 14:17
  • I have a hunch that you are trying to use the Chain Rule to calculate something, and you think you need this derivative in order to do that. If this is the case, you'd be better off performing the calculation without using the chain rule. – greg Aug 12 '20 at 14:37
  • @M.Marciani Yes, exactly. I read the material about vector derivatives, but I didn't understand how to deal with more than two dimensions. Specifically, I can't picture the result of that 3-index tensor $\frac{\partial \vec{y}_i}{\partial W_{jk}}$ (a worked-out version of its components is sketched after these comments). – YeongHwa Jin Aug 13 '20 at 05:09
  • @greg Yeah, I'm currently studying backpropagation in neural networks, which is why I'm wondering about this. But unfortunately, I didn't understand what you meant by "performing the calculation without using the chain rule". Why is that better? – YeongHwa Jin Aug 13 '20 at 05:15
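
To make the 3-index tensor mentioned in the comments concrete, here is a sketch of its components, derived only from the definition $\vec{y} = W\vec{x}$ given above (this working is not part of the original discussion):

$$ y_i = \sum_{k} w_{i,k}\, x_k \quad\Longrightarrow\quad \frac{\partial y_i}{\partial w_{j,k}} = \delta_{ij}\, x_k $$

So for the $2 \times 2$ case the only nonzero components are $\frac{\partial y_1}{\partial w_{1,1}} = x_1$, $\frac{\partial y_1}{\partial w_{1,2}} = x_2$, $\frac{\partial y_2}{\partial w_{2,1}} = x_1$, and $\frac{\partial y_2}{\partial w_{2,2}} = x_2$; mixed components such as $\frac{\partial y_1}{\partial w_{2,1}}$ vanish.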

0 Answers