Define matrix $\mathbf{A} \in \mathbb{C}^{n \times k}$, matrix $\mathbf{B} \in \mathbb{C}^{m \times k}$, and vector $\mathbf{c} \in \mathbb{C}^{k \times 1}$. Also, assume that $\mathbf{A}$ and $\mathbf{c}$ are independent of $\mathbf{B}$. What is the derivative of $\mathbf{A}\mathbf{B}^T\mathbf{B}\mathbf{c}$ with respect to $\mathbf{B}$?
-
I doubt this is differentiable; the main question is the differentiability of the transpose, which at first glance doesn't look like a continuous function to me. – Michael Medvinsky Oct 21 '15 at 13:41
2 Answers
$\def\C{\mathbf C}\def\M#1#2{\operatorname{Mat}_{#1,#2}(\C)}$Denote by $f \colon \M mk \to \C^n$ the function in question, that is $$ f(B) = AB^tBc $$ We have for any $B, H \in \M mk$ that \begin{align*} f(B+H) - f(B) &= A(B^t + H^t)(B + H)c - AB^tBc\\ &= AH^tBc + AB^tHc + AH^tHc \end{align*} Now, note that $\def\norm#1{\left\|#1\right\|}$by submultiplicativity of the norm and continuity (that is, boundedness) of the linear map $B \mapsto B^t$, there is a constant $C > 0$ such that $$ \norm{AH^tHc} \le \norm A \norm{H^t} \norm H \norm c \le C \norm A \norm H^2 \norm c = o(\norm H), \qquad H \to 0$$ The first two terms are linear in $H$, so the (Fréchet) derivative of $f$ at $B$ is the function $$ f'(B) \colon \M mk \to \C^n, \qquad H \mapsto AH^t Bc + AB^tHc $$
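As a sanity check on the formula above, here is a minimal numerical sketch, assuming NumPy is available. The sizes, the random data, and the helper name `crandn` are illustrative choices, not part of the answer. Since the remainder $f(B+tH) - f(B) - f'(B)[tH]$ equals $t^2 AH^tHc$, dividing its norm by $t$ should give values that shrink linearly in $t$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 3, 4, 5  # arbitrary small sizes, chosen only for illustration


def crandn(*shape):
    """Random complex array (hypothetical helper for this sketch)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


A, B, c = crandn(n, k), crandn(m, k), crandn(k)


def f(B):
    # .T is the plain transpose (no conjugation), matching B^t in the answer
    return A @ B.T @ B @ c


def df(B, H):
    # Candidate Frechet derivative applied to H: A H^t B c + A B^t H c
    return A @ H.T @ B @ c + A @ B.T @ H @ c


H = crandn(m, k)

# ||remainder|| / t should decrease by a factor of 10 with each step in t
ratios = []
for t in (1e-1, 1e-2, 1e-3):
    remainder = f(B + t * H) - f(B) - df(B, t * H)
    ratios.append(np.linalg.norm(remainder) / t)
    print(f"t = {t:.0e}   ||remainder|| / t = {ratios[-1]:.3e}")
```

The check works with complex data because $f$ only uses the plain transpose, so it is polynomial in the entries of $B$; with a conjugate transpose the same experiment would need a different notion of derivative.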
-
Can you please explain, or provide a reference for, why the transpose is a continuous and differentiable function? – Michael Medvinsky Oct 21 '15 at 13:47
-
The transpose is a linear map $(-)^t \colon \M nm \to \M mn$, and every linear map $T$ between finite-dimensional vector spaces is continuous (and hence differentiable, with $T'(x) = T$ for all $x$); see for example http://math.stackexchange.com/questions/112985/every-linear-mapping-on-a-finite-dimensional-space-is-continuous ... and as far as it matters for $f$, we've shown it above. – martini Oct 21 '15 at 13:49
-
Thanks. Note that in the second equation the last term should be $AH^tHc$ (fixed). – Michael Medvinsky Oct 21 '15 at 13:57
-
Ahem... the transpose is not a linear map: http://math.stackexchange.com/questions/1143614/is-matrix-transpose-a-linear-transformation – Michael Medvinsky Oct 21 '15 at 14:00
-
The transpose is linear for sure, as $(\mu A+\lambda B)^t = \mu A^t + \lambda B^t$, which is the definition of being linear. – martini Oct 21 '15 at 14:02
-
"Linear map" and "linear" are not the same thing; a linear map would mean that there is a matrix $M$ such that $A^T=MA$ for all matrices $A$. – Michael Medvinsky Oct 21 '15 at 14:04
-
Also, the link you provided states the continuity of a linear map, but not its differentiability. – Michael Medvinsky Oct 21 '15 at 14:26
$ \def\R#1{{\mathbb R}^{#1}} \def\LR#1{\left(#1\right)} \def\BR#1{\left[#1\right]} \def\CR#1{\left\lbrace #1 \right\rbrace} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\q{\quad} \def\qq{\qquad} \def\qif{\q\iff\q} \def\qiq{\q\implies\q} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} \def\T{{\sf T}} \def\B{B_{ij}} \def\E{{\boldsymbol{\cal E}}} \def\Eij{\E_{ij}} \def\Eji{\E_{ji}} $The gradient of a matrix with respect to one of its elements is $$\eqalign{ \grad{B}{\B} = \Eij \qiq \grad{B^\T}{\B} = \Eij^\T = \Eji \qq \qq \; }$$ All components of $\Eij\in\R{m\times k}\,$ equal zero, except the $(i,j)^{th}$ component, which equals $\tt1$.
Use this result to calculate the component-wise gradient of your vector-valued function $$\eqalign{ v &= AB^\T Bc \\ \grad v\B &= A\Eji Bc \,+\, AB^\T\Eij\,c \\ }$$ The full gradient is a third-order tensor, which can be calculated as the double sum $$\eqalign{ \grad vB = \sum_{i=1}^m\sum_{j=1}^k \gradLR v\B\star\Eij \q\in\;\R{n\times m\times k} }$$ where $\star$ denotes the dyadic/tensor product.
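The double sum can be sketched numerically as follows, assuming NumPy and real-valued data as in this answer. The array name `G` and the index layout `G[:, i, j]` $= \partial v/\partial B_{ij}$ are choices made for this sketch. The closed form in `G_closed` is not stated in the answer; it follows from expanding the single-entry matrices, since $A\,\mathcal{E}_{ji}\,Bc = A_{:,j}\,(Bc)_i$ and $AB^{\sf T}\mathcal{E}_{ij}\,c = (AB^{\sf T})_{:,i}\,c_j$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 3, 4, 5  # illustrative sizes only

A = rng.standard_normal((n, k))
B = rng.standard_normal((m, k))
c = rng.standard_normal(k)

# G[:, i, j] holds dv/dB_ij = A E_ji B c + A B^T E_ij c
G = np.zeros((n, m, k))
for i in range(m):
    for j in range(k):
        E = np.zeros((m, k))
        E[i, j] = 1.0  # single-entry matrix E_ij; E.T is E_ji
        G[:, i, j] = A @ E.T @ B @ c + A @ B.T @ E @ c

# Same entries without the loop: G[p,i,j] = A[p,j] (Bc)_i + (A B^T)[p,i] c_j
G_closed = (np.einsum("pj,i->pij", A, B @ c)
            + np.einsum("pi,j->pij", A @ B.T, c))

# Contracting the tensor with a direction H gives the directional derivative
H = rng.standard_normal((m, k))
directional = A @ H.T @ B @ c + A @ B.T @ H @ c

print(np.allclose(G, G_closed))
print(np.allclose(np.einsum("pij,ij->p", G, H), directional))
```

The last check ties the two answers together: contracting the third-order tensor over its $(i,j)$ indices with $H$ reproduces the linear map $H \mapsto AH^{\sf T}Bc + AB^{\sf T}Hc$.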