Let $X$ be an $n \times n$ Hermitian matrix with eigenvalues $\lambda_1,\lambda_2,\dots,\lambda_n$, so that $\mathrm{Tr}(|X|)=\sum_i|\lambda_i|$. Is there a formula for $\frac{\partial \mathrm{Tr}(|X|)}{\partial X}=\left(\frac{\partial \mathrm{Tr}(|X|)}{\partial x_{ij}}\right)_{ij}$ at every point of differentiability (that is, whenever $\lambda_i\neq 0$ for all $i$)? For example, when $n=1$, it is $\mathrm{sgn}(x)$ for $x\neq 0$.
Check out this paper by Marcus Carlsson about differentiating the modulus of a matrix. And since the trace of the modulus is the Nuclear Norm, this SE post is relevant. – greg Nov 25 '24 at 13:01
1 Answer
Be careful: if you work with Hermitian matrices, you cannot differentiate with respect to a single entry $x_{ij}$, since $X + \varepsilon E_{ij}$ is not Hermitian when $\varepsilon \neq 0$ and $i \neq j$. (Here, $E_{ij}$ is the matrix whose only non-zero coefficient is a $1$ at position $(i,j)$.) You can instead compute the differential with respect to Hermitian $X$, or partial derivatives with respect to, for example, $x_{ii}$, $x_{ij} + x_{ji}$ ($i \neq j$) and $\mathrm{i}(x_{ij} - x_{ji})$ ($i \neq j$).
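The remark above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `hermitian_basis` and `is_hermitian` are hypothetical, and the directions $E_{ii}$, $E_{ij} + E_{ji}$, $\mathrm{i}(E_{ij} - E_{ji})$ are just one convenient real basis of the Hermitian matrices (they match the coordinates above up to sign and scaling).

```python
import numpy as np

def hermitian_basis(n):
    """One real basis of the n x n Hermitian matrices (n^2 elements)."""
    basis = []
    for i in range(n):
        D = np.zeros((n, n), dtype=complex)
        D[i, i] = 1.0
        basis.append(D)                      # E_ii
    for i in range(n):
        for j in range(i + 1, n):
            S = np.zeros((n, n), dtype=complex)
            S[i, j] = S[j, i] = 1.0
            basis.append(S)                  # E_ij + E_ji
            A = np.zeros((n, n), dtype=complex)
            A[i, j], A[j, i] = 1j, -1j
            basis.append(A)                  # i(E_ij - E_ji)
    return basis

def is_hermitian(M):
    return np.allclose(M, M.conj().T)

n = 3
X = np.array([[2, 1j, 0], [-1j, -1, 1], [0, 1, 3]], dtype=complex)
eps = 1e-3

E01 = np.zeros((n, n), dtype=complex)
E01[0, 1] = 1.0                              # single off-diagonal entry

leaves_hermitian = is_hermitian(X + eps * E01)
stays_hermitian = all(is_hermitian(X + eps * B) for B in hermitian_basis(n))
print(leaves_hermitian, stays_hermitian, len(hermitian_basis(n)))
```

Perturbing along $E_{01}$ alone leaves the Hermitian matrices, while every basis direction keeps $X + \varepsilon B$ Hermitian, so only derivatives along such directions make sense.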
As in the case $n = 1$, the easiest method is to treat positive and negative values separately; the same idea works for $n \times n$ matrices. For every integer $0 \leqslant k \leqslant n$, let $H_{k,n}$ be the set of $n \times n$ Hermitian matrices with $k$ positive eigenvalues and $n - k$ negative eigenvalues. Since the eigenvalues depend continuously on the matrix, each $H_{k,n}$ is an open subset of the space of $n \times n$ Hermitian matrices. In fact, the $H_{k,n}$ are the connected components of the set of invertible Hermitian matrices.
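To illustrate the openness of $H_{k,n}$, here is a small sketch, assuming NumPy; the helper names and the chosen eigenvalues are hypothetical. It relies on Weyl's inequality: a Hermitian perturbation $Y$ moves each eigenvalue by at most its spectral norm $\|Y\|_2$, so no eigenvalue can cross $0$ as long as $\|Y\|_2 < \min_i |\lambda_i|$.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n, rng):
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (G + G.conj().T) / 2

def num_positive(X):
    """k = number of positive eigenvalues of a Hermitian matrix X."""
    return int(np.sum(np.linalg.eigvalsh(X) > 0))

# Build X with prescribed eigenvalues (illustrative choice): k = 3, n = 4.
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
X = Q @ np.diag([3.0, 1.0, 0.5, -2.0]) @ Q.conj().T
X = (X + X.conj().T) / 2                     # remove rounding asymmetry

k = num_positive(X)
gap = np.min(np.abs(np.linalg.eigvalsh(X)))  # smallest |eigenvalue|, about 0.5

ks = []
for _ in range(200):
    Y = random_hermitian(n, rng)
    Y *= 0.9 * gap / np.linalg.norm(Y, 2)    # spectral norm just below gap
    ks.append(num_positive(X + Y))

print(k, set(ks))
```

Every perturbed matrix has the same number of positive eigenvalues as $X$, as expected inside the open set $H_{k,n}$.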
Let $X \in H_{k,n}$, let $V_+ \subset \mathbb{C}^n$ be the sum of the eigenspaces of $X$ with positive eigenvalues and $V_- \subset \mathbb{C}^n$ the sum of the eigenspaces of $X$ with negative eigenvalues. Since $X \in H_{k,n}$, we have $\dim(V_+) = k$, $\dim(V_-) = n - k$ and $V_+ \oplus V_- = \mathbb{C}^n$. Moreover, since $X$ is Hermitian, this direct sum is orthogonal and, in the decomposition $\mathbb{C}^n = V_+ \oplus V_-$,
$$ X = \begin{pmatrix} P & 0 \\ 0 & N \end{pmatrix}, $$
where $P$ is Hermitian positive definite and $N$ is Hermitian negative definite. It follows immediately that $\mathrm{tr}(|X|) = \mathrm{tr}(P) - \mathrm{tr}(N)$.
Let $A$ and $B$ be Hermitian, of sizes $k \times k$ and $(n - k) \times (n - k)$. When $|\varepsilon|$ is small enough, $P + \varepsilon A$ is still positive definite and $N + \varepsilon B$ is still negative definite, so
$$ \mathrm{tr}\left(\left|X + \varepsilon\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}\right|\right) = \mathrm{tr}(|X|) + \varepsilon(\mathrm{tr}(A) - \mathrm{tr}(B)). $$
For the off-diagonal directions, let $C$ be a $k \times (n - k)$ complex matrix and set
$$ X_\varepsilon = \begin{pmatrix} P & \varepsilon C \\ \varepsilon C^\dagger & N \end{pmatrix}. $$
By the block determinant formula (Schur complement), for $T$ not an eigenvalue of $N$, \begin{align*} \chi_{X_\varepsilon}(T) & = \det(TI_n - X_\varepsilon)\\ & = \begin{vmatrix} TI_k - P & -\varepsilon C \\ -\varepsilon C^\dagger & TI_{n - k} - N \end{vmatrix}\\ & = \det(TI_k - P - \varepsilon C(TI_{n - k} - N)^{-1}\varepsilon C^\dagger)\det(TI_{n - k} - N)\\ & = \det(TI_k - P - \varepsilon^2C(TI_{n - k} - N)^{-1}C^\dagger)\det(TI_{n - k} - N). \end{align*}
This expression depends on $\varepsilon$ only through $\varepsilon^2$, so the coefficients of the polynomial $\chi_{X_\varepsilon}$ are smooth functions of $\varepsilon^2$. Since the eigenvalues of $X$ (i.e. the roots of $\chi_X$) are non-zero, the positive and the negative roots of $\chi_{X_\varepsilon}$ stay separated for $\varepsilon$ close to zero, and the sum of the positive roots and the sum of the negative roots are smooth functions of $\varepsilon^2$. We deduce that
$$ \mathrm{tr}(|X_\varepsilon|) = \mathrm{tr}(|X|) + \mathrm{O}(\varepsilon^2). $$
Therefore, by linearity of the differential, in general
$$ d\mathrm{tr}(|X|)Y = \mathrm{tr}(Y_1) - \mathrm{tr}(Y_3), $$
if
$$ Y = \begin{pmatrix} Y_1 & Y_2 \\ Y_2^\dagger & Y_3 \end{pmatrix} $$
in the decomposition $\mathbb{C}^n = V_+ \oplus V_-$.
Let $\Pi_X$ be the orthogonal projection onto $V_+$, so that $I_n - \Pi_X$ is the orthogonal projection onto $V_-$. The projection $\Pi_X$ is entirely determined by $X$. We have
$$ \mathrm{tr}(Y\Pi_X) = \mathrm{tr}(Y_1), \qquad \mathrm{tr}(Y(I_n - \Pi_X)) = \mathrm{tr}(Y_3). $$
Finally,
$$ d\mathrm{tr}(|X|)Y = \mathrm{tr}(Y\Pi_X) - \mathrm{tr}(Y(I_n - \Pi_X)) = \mathrm{tr}(Y(2\Pi_X - I_n)), $$
or, with the usual scalar product,
$$ \nabla\mathrm{tr}(|X|) = 2\Pi_X - I_n. $$
Notice that $X \mapsto \Pi_X$ is a smooth map from $H_{k,n}$ onto the space of orthogonal projections of rank $k$. In particular, $\Pi_X = I_n$ if $k = n$ and $\Pi_X = 0$ if $k = 0$. Finally, notice that $\nabla\mathrm{tr}(|X|)$ is always an orthogonal symmetry whose fixed-point space $V_+$ has dimension $k$; it acts as $+1$ on $V_+$ and $-1$ on $V_-$, which reduces to $\mathrm{sgn}(x)$ when $n = 1$.
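The formula $\nabla\mathrm{tr}(|X|) = 2\Pi_X - I_n$ can be sanity-checked with a finite difference along a Hermitian direction $Y$. This is only a sketch, assuming NumPy; the function names `trace_abs` and `grad_trace_abs`, the chosen eigenvalues and the step `eps` are illustrative assumptions, not part of the derivation.

```python
import numpy as np

def trace_abs(X):
    """tr(|X|) = sum of |eigenvalues| for Hermitian X."""
    return np.sum(np.abs(np.linalg.eigvalsh(X)))

def grad_trace_abs(X):
    """2 * Pi_X - I, with Pi_X the orthogonal projection onto V_+."""
    lam, U = np.linalg.eigh(X)
    U_pos = U[:, lam > 0]                    # orthonormal basis of V_+
    Pi = U_pos @ U_pos.conj().T
    return 2 * Pi - np.eye(X.shape[0])

rng = np.random.default_rng(1)
n = 4

# Invertible Hermitian X with prescribed eigenvalues (k = 2).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
X = Q @ np.diag([2.0, 1.0, -1.0, -3.0]) @ Q.conj().T
X = (X + X.conj().T) / 2                     # remove rounding asymmetry

# Random Hermitian direction Y.
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Y = (G + G.conj().T) / 2

grad = grad_trace_abs(X)
exact = np.trace(Y @ grad).real              # d tr(|X|) Y = tr(Y (2 Pi_X - I))

eps = 1e-6
fd = (trace_abs(X + eps * Y) - trace_abs(X - eps * Y)) / (2 * eps)

print(abs(fd - exact))                       # should be small
print(np.allclose(grad @ grad, np.eye(n)))   # symmetry: grad^2 = I
```

The central difference agrees with $\mathrm{tr}(Y(2\Pi_X - I_n))$ up to discretization and rounding error, and the gradient squares to the identity, consistent with it being an orthogonal symmetry.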