23

I am working on an old AMM problem:

Suppose $A,B$ are $n\times n$ real symmetric matrices with $\operatorname{tr} ((A+B)^k)= \operatorname{tr}(A^k) + \operatorname{tr}(B^k)$ for every positive integer $k$. Prove that $AB=0$.

I've done the following:

  • Denote by $(\lambda_i),(\mu_i),(\eta_i)$ the eigenvalues of $A$, $B$, $A+B$, respectively. They are real since $A,B$ are symmetric. Moreover, since $A,B$ are diagonalizable, if all $\lambda_i$ or all $\mu_i$ are zero, then $A$ or $B$ is zero and $AB=0$.

  • Suppose there exist some non-zero eigenvalues; then the given identity translates into $$ \sum_{i=1}^n \lambda_i^k +\sum_{i=1}^n \mu_i^k=\sum_{i=1}^n \eta_i^k \qquad \forall k >0. $$ From here I can prove that every non-zero eigenvalue $\eta_i$ appears on the LHS with the same multiplicity (divide by the eigenvalue of greatest absolute value and take the limits $k \to \infty$ along odd $k$ and along even $k$). Therefore at least $n$ of the $2n$ eigenvalues on the LHS are zero.

  • So, if $A$ has $p$ zero eigenvalues, then $B$ has at least $n-p$ zero eigenvalues (a numerical illustration of this setup is sketched below).
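A minimal numerical sketch of this setup (an editorial illustration, not part of the original post): the pair $A,B$ below is built so that $AB=0$ (the converse direction), and the script checks the trace identity and the eigenvalue bookkeeping described above. The construction and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Build a pair with AB = 0: let A and B act on mutually orthogonal subspaces.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # random orthogonal basis
A = Q[:, :2] @ np.diag([3.0, -1.0]) @ Q[:, :2].T    # rank-2 symmetric
B = Q[:, 2:4] @ np.diag([2.0, 5.0]) @ Q[:, 2:4].T   # rank-2 symmetric
assert np.allclose(A @ B, 0)

# tr((A+B)^k) = tr(A^k) + tr(B^k) for every positive k ...
for k in range(1, 8):
    lhs = np.trace(np.linalg.matrix_power(A + B, k))
    rhs = np.trace(np.linalg.matrix_power(A, k)) + np.trace(np.linalg.matrix_power(B, k))
    assert np.isclose(lhs, rhs)

# ... and the non-zero eigenvalues of A+B are those of A and B combined,
# so at least n of the 2n eigenvalues of A and B together must be zero.
print(np.sort(np.linalg.eigvalsh(A + B)))   # approx [-1, 0, 2, 3, 5]
```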

I feel like it is not long until the end, but I can't get any further ideas. How to proceed from here?

Srivatsan
  • 26,761
Beni Bogosel
  • 23,891
  • The AMM publishes solutions to its problems. What issue did this problem appear in? –  Dec 03 '11 at 16:02
  • The number of the problem is AMM 11483. I didn't search for the solution, since I thought I could solve it. If an answer does not turn up, I will look for the published solution. – Beni Bogosel Dec 03 '11 at 18:58
  • This is not such an old problem. It appeared in the Feb. 2010 issue (in April 2010, it was corrected: "non-negative" was changed to "positive"). The problem is still "current", in the sense that the solution has not yet been published. –  Dec 03 '11 at 19:15
  • official solution available here: https://www.jstor.org/stable/10.4169/amer.math.monthly.119.02.161 – user8675309 Jun 06 '24 at 20:21
  • @user8675309 I don't have access to the Feb 2012 issue of AMM. Do the published solutions have similar ideas to the answers below? – user1551 Jun 27 '25 at 13:22
  • @user1551 as of now (and a year ago when I posted the link) you can access 100 free JSTOR articles per month -- that's why I dropped the JSTOR link in. They only state one solution in it which is more or less the same thing as what you posted which is more or less the same thing as what cazanova posted, though their solution uses Hadamard's Determinant Inequality like you. Beyond the editors, there were only 3 solvers mentioned and unlike many other problems, no alternative proof approaches were mentioned. – user8675309 Jun 27 '25 at 15:16

6 Answers

10

Update: A solution appears on page 167 of the February 2012 issue of the American Mathematical Monthly. There were only three solvers, so we can deduce that this question was unusually difficult for a Monthly problem.

  • Wow. Did you get an advance copy? The Feb. 2012 table of contents is not even up on the website yet. :) – cardinal Jan 20 '12 at 02:32
  • The February issue appeared just today; it's available to subscribers only through the MAA website. But, yeah, it is a bit early! –  Jan 20 '12 at 02:41
  • Hmm. Thanks. Now, if only I could remember my username, I could log in... :) – cardinal Jan 20 '12 at 02:47
  • @ByronSchmuland Hi, sorry to interrupt you, but where can I find the February 2012 issue of the American Mathematical Monthly? I'm a French student. Thanks –  Apr 01 '14 at 10:54
  • @Julien If you give me your email address, I will send you a copy. –  Apr 01 '14 at 12:05
  • @Julien I have sent it. Let me know if you don't receive it. –  Apr 01 '14 at 16:42
  • @ByronSchmuland I got it. I am very grateful to you. –  Apr 01 '14 at 20:04
10

Lemma: Let $X$ and $Y$ be real symmetric $n \times n$ matrices, where the eigenvalues of $X$, $Y$ and $X+Y$ are $\alpha_1 \geq \alpha_2 \geq \cdots \geq \alpha_n$, $\beta_1 \geq \beta_2 \geq \cdots \geq \beta_n$ and $\gamma_1 \geq \gamma_2 \geq \cdots \geq \gamma_n$ respectively. Then $\sum_{i=1}^k \alpha_i + \sum_{i=1}^k \beta_i \geq \sum_{i=1}^k \gamma_i$. Moreover, if we have equality, then we can change bases so that $X$ and $Y$ are simultaneously block diagonal with blocks of size $k \times k$ and $(n-k) \times (n-k)$, and the eigenvalues of the first blocks are $(\alpha_1, \ldots, \alpha_k)$ and $(\beta_1, \ldots, \beta_k)$.

Remark: Note that I have deliberately used different variable names than Beni. One of the things that made this problem tricky for me was that I had to apply this lemma in several different ways, with different matrices playing the roles of $X$ and $Y$.

Proof: Thinking of $X$ and $Y$ as quadratic forms, it makes sense to restrict them to subspaces of $\mathbb{R}^n$. With this understanding, we have $\sum_{i=1}^k \alpha_i = \max_{V \in G(k,n)} \mathrm{Tr}\, X|_V$, where the max ranges over $k$-dimensional subspaces of $\mathbb{R}^n$. Moreover, the maximum is achieved precisely when $X V \subseteq V$ and the eigenvalues of $X|_V$ are $(\alpha_1, \ldots, \alpha_k)$. When all the $\alpha$'s are distinct, this can be expressed more simply by saying that $V$ is the span of the eigenvectors associated to the largest $k$ eigenvalues.

We want to show that $$\max_{V \in G(k,n)} \mathrm{Tr}\, X|_V + \max_{V \in G(k,n)} \mathrm{Tr}\, Y|_V \geq \max_{V \in G(k,n)} \mathrm{Tr} \left( X+Y \right)|_V$$ or, equivalently, $$\max_{(V,W) \in G(k,n)^2} \mathrm{Tr} \left( X|_V + Y|_W \right) \geq \max_{V \in G(k,n)} \mathrm{Tr} \left( X|_V +Y|_V \right).$$ But this is obvious, because the left-hand maximum is over a larger set.

Moreover, if we have equality, then there is some $k$-plane $V$ with $XV \subseteq V$, $YV \subseteq V$, and where the eigenvalues of $X|_V$ and $Y|_V$ are $(\alpha_1, \ldots, \alpha_k)$ and $(\beta_1, \ldots, \beta_k)$. This is precisely the required claim about being block diagonal. $\square$
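A quick numerical check of the lemma's inequality (subadditivity of the sum of the $k$ largest eigenvalues, i.e. Ky Fan's inequality); an illustrative sketch over random symmetric matrices, with all names assumptions of this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3

def top_k_sum(M, k):
    # Sum of the k largest eigenvalues of a symmetric matrix.
    return np.sort(np.linalg.eigvalsh(M))[-k:].sum()

for _ in range(1000):
    X = rng.standard_normal((n, n)); X = (X + X.T) / 2
    Y = rng.standard_normal((n, n)); Y = (Y + Y.T) / 2
    assert top_k_sum(X, k) + top_k_sum(Y, k) >= top_k_sum(X + Y, k) - 1e-9
```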

Let $A$ and $B$ be as in the problem. Let the eigenvalues of $A$ be $\lambda^{+}_1 \geq \cdots \geq \lambda^{+}_{p^{+}} > 0 =0=\cdots=0 > \lambda^{-}_{1} \geq \cdots \geq \lambda^{-}_{p^{-}}$. Define $\mu^{+}_i$, $\mu^{-}_i$, $q^{+}$ and $q^{-}$ similarly. Then, as Beni explains, we have $p^{+}+p^{-} + q^{+} + q^{-} \leq n$, and the eigenvalues of $A+B$ are the concatenation of the $\lambda^{\pm}_i$'s, the $\mu^{\pm}_i$'s, and $n-p^{+}-p^{-} - q^{+} - q^{-}$ copies of $0$.

Apply the lemma with $k=p^{+}+q^{+}$, $X=A$ and $Y=B$. This shows that we can split into two smaller problems, one where all the $\lambda$'s and $\mu$'s are positive and another where they are all nonpositive. We will focus on the first block; the second block is similar.

So we are reduced to the case that $A$ and $B$ have no negative eigenvalues.

Now, we use induction on $n$. The base case $n=1$ is easy. Without loss of generality, let $\lambda^{+}_1 \geq \mu^{+}_1$. So $\lambda^{+}_1$ is the greatest eigenvalue of both $A$ and $A+B$.

Apply the lemma with $k=1$, $X=A+B$ and $Y=-B$. The greatest eigenvalue of $-B$ is $0$. So the greatest eigenvalue of $(A+B)+(-B)$ is the sum of the greatest eigenvalues of $A+B$ and $-B$, and we conclude that we can split off a $1 \times 1$ block from $A+B$ and $-B$, where the block of $-B$ is zero.

The product of the $1 \times 1$ blocks is $0$, and by induction the product of the other blocks is $0$ as well.

3

Using determinants combined with Hadamard's inequality leads to a very short proof, but it is insightful to interpret this via the triangle inequality and the Schatten 1-norm, also known as the nuclear norm or the trace norm.

Applying one of many trace identities (e.g. Newton's) to $\text{trace}\big((A+B)^k\big)=\text{trace}\big(A^k\big)+\text{trace}\big(B^k\big)$ for $k\in \mathbb N$, in combination with the spectral theorem, tells us that
$Q^T\left[\begin{matrix}A +B & \mathbf 0\\ \mathbf 0 & \mathbf 0\end{matrix}\right]Q =\left[\begin{matrix}A & \mathbf 0\\ \mathbf 0 & B\end{matrix}\right]$ for some $Q\in O_{2n}(\mathbb R)$.

This in particular tells us that the triangle inequality for the Schatten 1-norm is met with equality, which in turn tells us that $A$ and $B$ admit a common orthogonal factor in their polar decompositions. That is, using (a choice of) polar decomposition $(A+B)=UP$ where $U\in O_n(\mathbb R)$, we get

$$\begin{aligned}\big \Vert A\big \Vert_{S_1}+\big \Vert B\big \Vert_{S_1} &=\big \Vert A+B\big \Vert_{S_1}\\ &=\text{trace}\big(P\big)\\ & =\text{trace}\big(U^T (UP)\big)\\ &=\text{trace}\big(U^T(A+B)\big) \\& =\text{trace}\big(U^TA\big)+\text{trace}\big(U^TB\big)\\& \leq \Big \vert \text{trace}\big(U^TA\big)\Big \vert+\Big \vert\text{trace}\big(U^TB\big)\Big \vert\\ &\leq \big \Vert A\big \Vert_{S_1}+\big \Vert B\big \Vert_{S_1} \end{aligned}$$ where the final inequality can more granularly be written as
$\Big \vert\text{trace}\big(U^TA\big)\Big \vert =\Big \vert\sum_{k=1}^n \lambda_k^{(U^TA)}\Big \vert\leq \sum_{k=1}^n \vert \lambda_k^{(U^TA)}\vert\leq \sum_{k=1}^n \sigma_k^{(U^TA)}=\big \Vert A\big \Vert_{S_1}$
$\Big \vert\text{trace}\big(U^TB\big)\Big \vert =\Big \vert\sum_{k=1}^n \lambda_k^{(U^TB)}\Big \vert\leq \sum_{k=1}^n \vert \lambda_k^{(U^TB)}\vert\leq \sum_{k=1}^n \sigma_k^{(U^TB)}=\big \Vert B\big \Vert_{S_1}$

Working from right to left, each upper bound is necessarily met with equality. The equality conditions of the right inequality tell us that $U^TA$ and $U^TB$ are each normal (ref. the 'deferred proof' under Kronecker Product of Normal Matrices), and since the triangle inequality is met with equality, all eigenvalues lie on the same (positive) ray from the origin $\implies$ $U^TA$ and $U^TB$ are each symmetric PSD. Note this means $(U^TB)=(U^TB)^T = BU$.
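The two scalar bounds invoked here (the triangle inequality and Weyl's majorant inequality $\sum_k|\lambda_k|\le\sum_k\sigma_k$) are easy to sanity-check numerically; a minimal sketch with a random matrix $M$ standing in for $U^TA$:

```python
import numpy as np

rng = np.random.default_rng(7)
M = rng.standard_normal((5, 5))           # stand-in for U^T A

lam = np.linalg.eigvals(M)                # eigenvalues (complex in general)
sig = np.linalg.svd(M, compute_uv=False)  # singular values
assert abs(lam.sum()) <= np.abs(lam).sum() + 1e-9  # triangle inequality
assert np.abs(lam).sum() <= sig.sum() + 1e-9       # Weyl: sum|lam| <= sum sigma
```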

Finally, returning to $\text{trace}\big((A+B)^2\big)=\text{trace}\big(A^2\big)+\text{trace}\big(B^2\big)$
$\implies 0=\text{trace}\big(AB\big)=\text{trace}\big((U^TA)(BU)\big)\implies (U^TA)(BU)=\mathbf 0$
since the product of two symmetric PSD matrices has real non-negative eigenvalues and is diagonalizable (ref. "A and B are real, symmetric and positive semi-definite matrices of the same order; is AB diagonalizable?"); alternatively one can check $\text{trace}\Big((U^TA)(BU)\Big)=\Big\Vert (BU)^\frac{1}{2}(U^TA)^\frac{1}{2}\Big\Vert_F^2$.

Conclude that $AB$ is conjugate to the zero matrix, hence $AB =\mathbf 0$.
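A numerical sketch of the key claim above (the orthogonal polar factor $U$ of $A+B$ makes both $U^TA$ and $U^TB$ symmetric PSD), using an illustrative pair with $AB=0$ so that the trace hypothesis holds; the construction is an assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:, :2] @ np.diag([3.0, -1.0]) @ Q[:, :2].T   # AB = 0 by construction
B = Q[:, 2:4] @ np.diag([2.0, 5.0]) @ Q[:, 2:4].T

# Polar decomposition A + B = U P via the SVD: U = W V^T is orthogonal.
W, s, Vt = np.linalg.svd(A + B)
U = W @ Vt
for M in (U.T @ A, U.T @ B):                # both should be symmetric PSD
    assert np.allclose(M, M.T)
    assert np.linalg.eigvalsh(M).min() > -1e-9
```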

user8675309
  • 12,193
1

Define $ M=\left[ \begin{array}{cc} A+B & O_n\\ O_n & O_n \end{array}\right] $ and $ N=\left[ \begin{array}{cc} A & O_n\\ O_n & B \end{array}\right] $.

$ (\forall k \in \mathbb{N}^*,Tr((A+B)^k)=Tr(A^k)+Tr(B^k))$ $\Leftrightarrow$ $ (\forall k \in \mathbb{N},Tr(M^k)=Tr(N^k))$ $\Leftrightarrow \chi_M=\chi_N$, where $\chi_S$ is the characteristic polynomial of the matrix $S$. Then the hypothesis is equivalent to:

$X^n\cdot\chi_{A+B}(X)=\chi_A(X)\cdot\chi_B(X)$.
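This polynomial identity is easy to sanity-check numerically; a minimal sketch using an illustrative pair with $AB=0$ (so the hypothesis holds), with the construction assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:, :2] @ np.diag([3.0, -1.0]) @ Q[:, :2].T   # AB = 0 by construction
B = Q[:, 2:4] @ np.diag([2.0, 5.0]) @ Q[:, 2:4].T

chi = np.poly   # characteristic-polynomial coefficients, leading coeff 1
lhs = np.polymul(chi(A + B), np.r_[1, np.zeros(n)])   # X^n * chi_{A+B}(X)
rhs = np.polymul(chi(A), chi(B))                      # chi_A(X) * chi_B(X)
assert np.allclose(lhs, rhs)
```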

Denote by $ \alpha=\{\alpha_1,\alpha_2,...,\alpha_p\} $ (resp. $ \beta=\{\beta_1,\beta_2,...,\beta_q \} $, $ \gamma=\{\gamma_1,\gamma_2,...,\gamma_r\} $) the non-zero eigenvalues of $A$ (resp. $B$, $C:=A+B$), not necessarily distinct. Then $\gamma = \alpha \cup \beta $ as multisets and $r=p+q$. In addition, we have $Im(A+B)=Im(A)\oplus Im(B)$ (where $Im(S)$ denotes the range of a matrix $S$). Since we can consider just the restrictions of $A$, $B$ and $A+B$ to the subspace $Im(A+B)$, we may assume that $Im(A+B)=\mathbb{R}^n$.

Let $a=\{a_1,...,a_p\}$ (resp. $b=\{b_1,...,b_q\}$, $c=\{c_1,...,c_r\}$) be an orthonormal basis of $Im(A)$ (resp. $Im(B)$, $Im(C)$) that consists of eigenvectors of $A$ (resp. $B$, $C$), that is, $Aa_i=\alpha_i a_i$ (resp. $Bb_i=\beta_i b_i$, $Cc_i=\gamma_i c_i$), and let $ D_a=diag(\alpha_1,...,\alpha_p) $ and $D_b=diag(\beta_1,...,\beta_q)$. Then the matrix $A+B$ can be written in the basis $c$ as $ F_1=\left[ \begin{array}{cc} D_a & O\\ O & D_b \end{array}\right] $. Now denote $S=(s_{ij})_{i,j}$ where $s_{ij}=\langle a_i \mid b_j\rangle$ for $1 \le i \le p$ and $1 \le j \le q$.

Then the matrix $A+B$ can be written in the basis $a\cup b$ as $F_2=\left[ \begin{array}{cc} D_a & D_a S \\ D_b S^T & D_b \end{array}\right]=\left[ \begin{array}{cc} D_a & O \\ O & D_b \end{array}\right]\left[ \begin{array}{cc} I & S \\ S^T & I \end{array}\right] $. But $F_1$ and $F_2$ are similar, so $det(F_1)=det(F_2)$ $\Rightarrow$ $det(I-S^T S)=det \left[ \begin{array}{cc} I & S \\ S^T & I \end{array}\right]=1$ (the first equality is the Schur-complement formula for the determinant).

On the other hand, for $ 1 \le j \le q$ set $n_j=b_j - \sum_{i=1}^{p}s_{ij}a_i$; then $\langle n_i \mid n_j\rangle=(I-S^T S)_{ij}$, so the matrix $I-S^T S $ is a Gram matrix, and it follows that $I-S^T S $ is symmetric positive semidefinite. Denote by $ 0 \le \theta_1 \le \theta_2 \le ... \le \theta_q $ its eigenvalues.

Then, by the arithmetic mean-geometric mean inequality (using $\prod_{i=1}^{q}\theta_i=det(I-S^TS)=1$): $q-\sum_{i=1}^{p}\sum_{j=1}^{q}s_{ij}^2 = Tr(I- S^T S)=\sum_{i=1}^{q}\theta_i \ge q \sqrt[q]{\prod_{i=1}^{q} \theta_i}=q$, which gives $\sum_{i=1}^{p}\sum_{j=1}^{q}s_{ij}^2 \le 0$, that is, $S=O$.

It follows that for $ 1 \le i \le p $, $Ba_i=\sum_{j=1}^{q}s_{ij} \beta_j b_j = 0$, and for $ 1 \le j \le q $, $Ab_j=\sum_{i=1}^{p}s_{ij} \alpha_i a_i = 0$.

Therefore we easily deduce $AB=BA=O$: every $Bx$ lies in the span of $b$ and $A$ kills each $b_j$, so $AB=O$; symmetrically, $BA=O$.
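A numerical sketch of the pivot of this argument ($det(I-S^TS)=1$ together with AM-GM forces $S=O$), on an illustrative $AB=0$ pair; the helper `range_basis` is hypothetical, not from the answer:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:, :2] @ np.diag([3.0, -1.0]) @ Q[:, :2].T   # AB = 0 by construction
B = Q[:, 2:4] @ np.diag([2.0, 5.0]) @ Q[:, 2:4].T

def range_basis(M, tol=1e-9):
    # Orthonormal eigenvectors spanning Im(M), for symmetric M.
    w, P = np.linalg.eigh(M)
    return P[:, np.abs(w) > tol]

a, b = range_basis(A), range_basis(B)
S = a.T @ b                            # S_ij = <a_i | b_j>
theta = np.linalg.eigvalsh(np.eye(S.shape[1]) - S.T @ S)
print(theta.prod(), theta.sum())       # approx 1 and q, so AM-GM is tight
print(np.abs(S).max())                 # approx 0, i.e. S = O
```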

0

(The content below is migrated from an answer to a question that can be viewed as a duplicate of the current one but was closed for another reason.)

As pointed out by other answers here, the trace condition means that $\pmatrix{A+B\\ &0}$ is orthogonally similar to $\pmatrix{A\\ &B}$.

(In the study of quadratic forms, the Craig-Sakamoto theorem states that, given any two real symmetric matrices $A$ and $B$, $\det(I-xA-yB)=\det(I-xA)\det(I-yB)$ for all $x,y\in\mathbb R$ if and only if $AB=0$. The statement in this old AMM problem is equivalent to the Craig-Sakamoto theorem, but with a weaker assumption.)

Let $a=\operatorname{rank}(A)$ and $b=\operatorname{rank}(B)$. Since $A$ and $B$ are real symmetric, we may write $$ A=UDU^T \quad\text{and}\quad B=V\Lambda V^T $$ for some nonsingular diagonal matrices $D\in GL_a(\mathbb R),\,\Lambda\in GL_b(\mathbb R)$ and some tall matrices $U\in M_{n,a}(\mathbb R),\,V\in M_{n,b}(\mathbb R)$ with orthonormal columns. Then $$ A+B=\underbrace{\pmatrix{U&V}\pmatrix{D\\ &\Lambda}}_X\ \underbrace{\pmatrix{U^T\\ V^T}}_Y=XY. $$ Since $XY$ and $YX$ always have the same multi-set of nonzero eigenvalues, $XY$ has the same nonzero eigenvalues as $$ YX=\pmatrix{I_a&U^TV\\ V^TU&I_b}\pmatrix{D\\ &\Lambda}. $$ However, $(A+B)\oplus0$ is similar to $A\oplus B$, which in turn is similar to $D\oplus\Lambda\oplus0$. Therefore, the nonzero eigenvalues of $YX$ are precisely the eigenvalues of the nonsingular diagonal matrix $D\oplus\Lambda$. Since the two matrices have the same size, we in turn obtain $\det(YX)=\det(D\oplus\Lambda)$. It follows that $$ 1=\det\pmatrix{I_a&U^TV\\ V^TU&I_b} =\det(I_b-V^TUU^TV). $$ As $U$ and $V$ have orthonormal columns, $0\preceq V^TUU^TV\preceq V^T(I_n)V=I_b$, so the eigenvalues of $I_b-V^TUU^TV$ lie in $[0,1]$. For the determinantal equality above to hold, every eigenvalue of the PSD matrix $V^TUU^TV$ must be $0$, so $V^TUU^TV=0$. Therefore $U^TV=0$ and $AB=0$.
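A numerical sketch of this determinant computation, on an illustrative $AB=0$ pair; the helper `thin_factor` (a rank factorization via the eigendecomposition) is hypothetical, not from the answer:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:, :2] @ np.diag([3.0, -1.0]) @ Q[:, :2].T   # AB = 0 by construction
B = Q[:, 2:4] @ np.diag([2.0, 5.0]) @ Q[:, 2:4].T

def thin_factor(M, tol=1e-9):
    # Rank factorization M = P diag(w) P^T, keeping only nonzero eigenpairs.
    w, P = np.linalg.eigh(M)
    keep = np.abs(w) > tol
    return P[:, keep], np.diag(w[keep])

U, D = thin_factor(A)     # A = U D U^T, U has orthonormal columns
V, L = thin_factor(B)     # B = V L V^T
G = np.eye(V.shape[1]) - V.T @ U @ U.T @ V
print(np.linalg.det(G))          # = 1 under the trace hypothesis
print(np.linalg.norm(U.T @ V))   # approx 0, hence AB = 0
```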

user1551
  • 149,263
0

This is only a modification and reinterpretation of user8675309’s answer for my own reference. Recall that given any real square matrix $M$, if we define $|M|=(M^TM)^{1/2}$, then $U|M|$ is a polar decomposition of $M$ if and only if $U$ is an orthogonal matrix that maximises $\operatorname{tr}(U^TM)$. Now return to our question. The given condition implies that $A\oplus B$ is similar to $(A+B)\oplus0$. Then \begin{align} &A\oplus B \sim (A+B)\oplus0\\ &\Rightarrow |A|\oplus |B| \sim |A+B|\oplus0\tag{1}\\ &\Rightarrow \operatorname{tr}|A|+\operatorname{tr}|B| = \operatorname{tr}|A+B|\\ &\Rightarrow |A|+|B|=|A+B| \quad\text{(by the trace characterisation of polar decomposition)}\\ &\Rightarrow |A|\oplus |B| \sim \big(|A|+|B|\big)\oplus0 \ \text{ by } (1)\ \text{(this reduces the problem to the PSD case)}\\ &\Rightarrow \operatorname{tr}\big(|A|^2+|B|^2\big)=\operatorname{tr}\left(\big(|A|+|B|\big)^2\right)\\ &\Rightarrow \operatorname{tr}\big(|A||B|\big)=0\\ &\Rightarrow |A||B|=0\\ &\Rightarrow AB=0.\\ \end{align} (For the last two steps: $\operatorname{tr}\big(|A||B|\big)=\big\Vert|B|^{1/2}|A|^{1/2}\big\Vert_F^2$ forces $|A||B|=0$; and a common orthogonal polar factor $U$ gives $A=U|A|$ and $B=U|B|=|B|U^T$, so $AB=U|A|\,|B|U^T=0$.)
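A numerical sketch of the two key identities in this chain ($|A|+|B|=|A+B|$ and $|A||B|=0$), again with an illustrative pair satisfying $AB=0$; the helper `mabs` and the construction are assumptions of this sketch:

```python
import numpy as np

def mabs(M):
    # |M| = (M^T M)^{1/2} for symmetric M, via the spectral decomposition.
    w, P = np.linalg.eigh(M)
    return P @ np.diag(np.abs(w)) @ P.T

rng = np.random.default_rng(5)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q[:, :2] @ np.diag([3.0, -1.0]) @ Q[:, :2].T   # AB = 0 by construction
B = Q[:, 2:4] @ np.diag([2.0, 5.0]) @ Q[:, 2:4].T

assert np.allclose(mabs(A) + mabs(B), mabs(A + B))   # |A| + |B| = |A+B|
assert np.allclose(mabs(A) @ mabs(B), 0)             # |A||B| = 0
```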

user1551
  • 149,263