3

Given three real $n\times n$ matrices $X,Y,Z$ satisfying the following conditions: $$XZ=ZX$$ $$YZ=ZY$$ $$\mathrm{rank}(XY-YX+I)=1,$$ prove that $Z=aI$ for some real number $a$.

One possible solution to this problem was sketched in the comments years ago. The idea is that, up to conjugation by a complex invertible matrix, we can assume $Z$ to be a Jordan matrix. This forces the matrices $X,Y$ to have a precise block structure, and the rest of the proof should then follow after a rather long computation.

I would like to know if the problem can be solved in an easier/faster way.

kimchi lover
  • 24,981
qan
  • 434
  • @FormulaWriter it is fairly easy to show, without use of Jordan forms, that $Z$ can only have a single eigenvalue, $a$. Showing that eigenvalue to be semi-simple... seems to require getting in the weeds of commuting triangular structures. – user8675309 Jul 27 '20 at 03:56
  • @user8675309 Interesting. May I ask you how to show $Z$ has only one eigenvalue? – FormulaWriter Jul 27 '20 at 09:13
  • @FormulaWriter computing the trace shows that the rank one matrix is diagonalizable, so use similarity transforms and assume WLOG it is diagonal. Set $C:=XY-YX$; then $ZC = CZ$, i.e. $Z( n\mathbf e_1\mathbf e_1^T -I) = ( n\mathbf e_1\mathbf e_1^T-I)Z$, $\longrightarrow Z$ is block diagonal. Now $\text{trace}\big( Z^kC\big) = 0$ for all powers $k$, which via the block structure shows all eigenvalues of $Z$ are equal to $a$. Equivalently, consider $Z' =Z-aI$ so $Z'$ has a zero in the top left corner and $\text{trace}\big( (Z')^kC\big) = 0\longrightarrow$ $Z'$ is nilpotent. – user8675309 Jul 27 '20 at 17:08
  • @user8675309 May I ask you to expand the details in a (partial) answer? – FormulaWriter Jul 27 '20 at 20:18

5 Answers

1

Here's a partial proof that addresses algebraic multiplicities of eigenvalues.
All matrices are $n \times n$ unless indicated otherwise.

Leg one: $Z$ has a single eigenvalue $\sigma$ with algebraic multiplicity $n$.
$\mathrm{rank}(XY-YX+I)=1\Longrightarrow XY-YX+I = \mathbf a\mathbf b^T$ for some nonzero vectors $\mathbf a,\mathbf b\in\mathbb R^n$.
Taking the trace of each side, we get
$$\text{trace}\big(XY\big)-\text{trace}\big(YX\big)+\text{trace}\big(I_n\big) = n = \text{trace}\big(\mathbf a\mathbf b^T\big).$$ Thus $\mathbf {ab}^T$ is a rank-one matrix with trace $n\neq 0$, so it is diagonalizable by some invertible matrix $S$. If $\mathbf e_k$ denotes the $k$th standard basis vector, we have

$$S^{-1}\big(XY-YX+I\big)S = (S^{-1}XS)(S^{-1}YS)-(S^{-1}YS)(S^{-1}XS)+I = S^{-1}\mathbf a\mathbf b^TS = n \mathbf e_1\mathbf e_1^T.$$

At this point, we could formally do a change of variables and define
$X':=(S^{-1}XS)$, $Y':=(S^{-1}YS)$, $Z':=(S^{-1}ZS)$.
Conjugation preserves all of the commutation hypotheses, so for notational simplicity we instead proceed by assuming WLOG that
$XY-YX =n \mathbf e_1\mathbf e_1^T- I$.
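As a quick numerical sanity check of this reduction (not part of the argument), the sketch below builds a random rank-one matrix $\mathbf a\mathbf b^T$ with $\mathbf b^T\mathbf a=n$ and an explicit change of basis $S$ whose first column is $\mathbf a$ and whose remaining columns span $\mathbf b^\perp$; conjugating by this $S$ yields exactly $n\,\mathbf e_1\mathbf e_1^T$. The dimension, seed and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random rank-one matrix a b^T, rescaled so that trace(a b^T) = b^T a = n.
a = rng.standard_normal(n)
b = rng.standard_normal(n)
b *= n / (b @ a)
R = np.outer(a, b)                        # rank one, trace n

# Change of basis: first column a, remaining columns an orthonormal basis of b-perp.
U, _, _ = np.linalg.svd(b.reshape(-1, 1))
S = np.column_stack([a, U[:, 1:]])        # invertible because b^T a = n != 0

D = np.linalg.inv(S) @ R @ S              # should be n * e_1 e_1^T
target = np.zeros((n, n)); target[0, 0] = n
print(np.allclose(D, target))             # True (up to round-off)
```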

Since $Z$ commutes with the left-hand side, it commutes with the right-hand side, and
$Z\big(n \mathbf e_1\mathbf e_1^T- I\big) = \big(n \mathbf e_1\mathbf e_1^T- I\big)Z\Longrightarrow Z= \left[\begin{matrix}\sigma &\mathbf 0^T\\\mathbf 0 &Z_{n-1}\end{matrix}\right].$
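To see concretely why commuting with $n\mathbf e_1\mathbf e_1^T-I=\operatorname{diag}(n-1,-1,\dots,-1)$ pins down this block shape: the commutator of any $Z$ with a diagonal matrix $D$ has entries $Z_{ij}(d_j-d_i)$, and for this $D$ these vanish except in the first row and first column, so $ZD=DZ$ forces exactly those entries of $Z$ to be zero. A small illustration (not part of the proof; size and seed are arbitrary):

```python
import numpy as np

n = 5
D = -np.eye(n); D[0, 0] = n - 1           # n e_1 e_1^T - I

Z = np.random.default_rng(1).standard_normal((n, n))
K = Z @ D - D @ Z                         # entries K_ij = Z_ij (d_j - d_i)

# The nonzero entries of [Z, D] sit exactly in the first row/column (off the corner),
# so Z D = D Z forces those entries of Z to vanish, i.e. Z = sigma ⊕ Z_{n-1}.
support = np.abs(K) > 1e-12
expected = np.zeros((n, n), dtype=bool)
expected[0, 1:] = expected[1:, 0] = True
print(np.array_equal(support, expected))  # True
```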

From here define $Z'' := Z -\sigma I$ (which still commutes with $X$ and $Y$). Then for every integer $k\ge 1$,

$(Z'')^k\big(XY-YX\big)= \left[\begin{matrix}0 &\mathbf 0^T\\\mathbf 0 &(Z_{n-1}-\sigma I_{n-1})^k\end{matrix}\right]\left[\begin{matrix}n-1 &\mathbf 0^T\\\mathbf 0 &-I_{n-1}\end{matrix}\right] = \left[\begin{matrix}0 &\mathbf 0^T\\\mathbf 0 &-(Z_{n-1}-\sigma I_{n-1})^k\end{matrix}\right]$
Taking the trace of each side and using the cyclic property of the trace together with $X(Z'')^k=(Z'')^kX$, we have
$-\text{trace}\Big((Z_{n-1}-\sigma I_{n-1})^k\Big) $
$= \text{trace}\Big((Z'')^k\big(XY-YX\big)\Big) $
$= \text{trace}\Big((Z'')^kXY\Big)-\text{trace}\Big((Z'')^kYX\Big) $
$= \text{trace}\Big((Z'')^kXY\Big)-\text{trace}\Big(X(Z'')^kY\Big) $
$= \text{trace}\Big((Z'')^kXY\Big)-\text{trace}\Big((Z'')^k XY\Big) $
$=0$.

Thus $Z_{n-1}-\sigma I_{n-1}$ is nilpotent, i.e., $Z$ has eigenvalue $\sigma$ with algebraic multiplicity $n$.
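The last step uses the standard fact (valid in characteristic $0$) that a matrix whose power traces all vanish is nilpotent. The mechanism is Newton's identities, which recover the characteristic polynomial coefficients from the power sums $p_k=\text{trace}(A^k)$; here is a small sketch of that computation (the function name is mine, and the test matrices are arbitrary):

```python
import numpy as np

def charpoly_from_power_traces(A):
    """Coefficients c_1..c_n of x^n + c_1 x^{n-1} + ... + c_n, recovered from the
    power sums p_k = trace(A^k) via Newton's identities
    p_k + c_1 p_{k-1} + ... + c_{k-1} p_1 + k c_k = 0."""
    n = A.shape[0]
    p = [np.trace(np.linalg.matrix_power(A, k)) for k in range(1, n + 1)]
    c = []
    for k in range(1, n + 1):
        s = p[k - 1] + sum(c[i] * p[k - 2 - i] for i in range(k - 1))
        c.append(-s / k)
    return c

# Cross-check against numpy's characteristic polynomial on a generic matrix.
B = np.random.default_rng(3).standard_normal((4, 4))
print(np.allclose(charpoly_from_power_traces(B), np.poly(B)[1:]))   # True

# If trace(A^k) = 0 for k = 1..n, every c_k is forced to be 0, so the characteristic
# polynomial is x^n and A is nilpotent; e.g. a strictly upper triangular matrix.
A = np.triu(np.random.default_rng(2).standard_normal((4, 4)), k=1)
print(np.allclose(charpoly_from_power_traces(A), 0))                # True
```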

user26857
  • 53,190
user8675309
  • 12,193
  • Thank you for leg one :) I don't know why these two statements hold:
    1. $\mathbf{ab}^T$ is rank one with trace $n$ so it may be diagonalized
    2. ...thus $(Z_{n-1}-\sigma I_{n-1})$ is nilpotent.

    If you can show me I'll thank you again, meanwhile I'll try to figure out.

    – FormulaWriter Jul 29 '20 at 21:31
  • hints: 1.) a rank one matrix has eigenvalue zero with geo multiplicity $n-1$ (by rank nullity) and geo mult $\leq$ algebraic mult $\leq$ n; if algebraic mult is n, what does that imply about the trace? for 2.) https://math.stackexchange.com/questions/159167/traces-of-all-positive-powers-of-a-matrix-are-zero-implies-it-is-nilpotent/159201 ... also the geometric series argument here: https://math.stackexchange.com/questions/3623345/products-of-matrices-in-either-order-have-the-same-characteristic-polynomial/3624060#3624060 since nilpotent matrices must have zero trace for all powers k – user8675309 Jul 29 '20 at 21:59
1

Here is a solution that uses Jordan normal form. First, note that $XY-YX+I$ cannot be rank-one when, via the same similarity transform, $$ X\sim\pmatrix{U_1&U_2\\ 0&U_3},\ Y\sim\pmatrix{V_1&V_2\\ 0&V_3}\tag{1} $$ where $U_1,V_1$ are $r\times r$ for some $0<r<n$ and $U_3,V_3$ are $(n-r)\times(n-r)$. This is because $(1)$ implies that $$ XY-YX+I\sim\pmatrix{U_1V_1-V_1U_1+I_r&\ast\\ 0&U_3V_3-V_3U_3+I_{n-r}}, $$ but the ranks of both $U_1V_1-V_1U_1+I_r$ and $U_3V_3-V_3U_3 +I_{n-r}$ are at least $1$ (because the two matrices have nonzero traces), so that $$ \operatorname{rank}(XY-YX+I_n)\ge\operatorname{rank}(U_1V_1-V_1U_1+I_r)+\operatorname{rank}(U_3V_3-V_3U_3 +I_{n-r})\ge2. $$
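A quick numerical illustration of this rank bound (just a sanity check; sizes and seed are arbitrary): for random $X,Y$ in the simultaneously block upper-triangular form $(1)$, the matrix $XY-YX+I$ has rank at least $2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 7, 3

def block_upper(r, n):
    """Random matrix with an r x r and an (n-r) x (n-r) diagonal block and zero lower-left block."""
    M = rng.standard_normal((n, n))
    M[r:, :r] = 0
    return M

X, Y = block_upper(r, n), block_upper(r, n)
C = X @ Y - Y @ X + np.eye(n)
# The diagonal blocks of C are commutators plus identities, hence have nonzero trace,
# so each contributes at least 1 to the rank and rank(C) >= 2.
print(np.linalg.matrix_rank(C) >= 2)   # True
```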

Now, if $Z$ has $k>1$ different eigenvalues $\lambda_1,\lambda_2,\ldots,\lambda_k$, by a change of basis over $\mathbb C$, we may assume that $Z$ is in Jordan form $Z=Z_1\oplus\cdots\oplus Z_k$, where each submatrix $Z_i$ is a Jordan form for the eigenvalue $\lambda_i$. Since $X$ and $Y$ commute with $Z$, they must assume block-diagonal forms $X=X_1\oplus\cdots\oplus X_k$ and $Y=Y_1\oplus\cdots\oplus Y_k$ where both $X_i$ and $Y_i$ have the same size as $Z_i$. But then $X$ and $Y$ will be in the form of $(1)$, which contradicts the assumption that $\operatorname{rank}(XY-YX+I_n)=1$.

Therefore $Z$ must possess a single eigenvalue $\lambda$ of multiplicity $n$. We may assume that $Z=J_{r_1}\oplus\cdots\oplus J_{r_m}$, where each $J_{r_i}$ denotes a Jordan block of size $r_i$ for the eigenvalue $\lambda$, with $1\le r_1\le\cdots\le r_m$ and $r_1+\cdots+r_m=n$.

Every matrix $B$ that commutes with such a Jordan form $Z$ can be partitioned into a block matrix form $(B_{ij})$, where each sub-block $B_{ij}$ is of the form $$ \mathbb R^{r_i\times r_j}\ni B_{ij}= \begin{cases} T_{ij}&\text{if }r_i=r_j\\ \pmatrix{T_{ij}\\ 0}&\text{if }r_i>r_j\\ \pmatrix{0&T_{ij}}&\text{if }r_i<r_j\\ \end{cases}\tag{2} $$ and $T_{ij}\in\mathbb R^{\min(r_i,r_j)\times\min(r_i,r_j)}$ is an upper triangular (square) Toeplitz matrix. Since each $B_{ij}$ is "upper triangular", if we put $\mathcal I=\{1,\,1+r_1,\,1+r_1+r_2,\,\ldots,\,1+r_1+r_2+\cdots+r_{m-1}\}$ (i.e. each element of $\mathcal I$ is the row/column index of the top-left element of some sub-block $B_{ij}$ in $B$) and $\mathcal J$ be the complement of $\mathcal I$ in $\{1,2,\ldots,n\}$, then by $(2)$, $B([\mathcal I,\mathcal J],[\mathcal I,\mathcal J])$ will be in the form of $\pmatrix{W_1&W_2\\ 0&W_3}$, where $W_1$ is $m\times m$. Now, if $Z$ has any non-trivial Jordan block, then $m<n$ and hence both $W_1$ and $W_3$ are non-empty. This is true in particular for $X$ and $Y$. But then $(1)$ is true and we arrive at a contradiction. Hence $Z$ cannot have any non-trivial Jordan block, i.e. $Z$ must be a scalar matrix.
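To make the description $(2)$ of matrices commuting with $Z$ concrete, here is a small sketch (illustrative only; the block sizes $r=(2,3)$, the eigenvalue, and the helper names are arbitrary choices) that assembles such a $B$ from random upper-triangular Toeplitz cores and checks that it commutes with the Jordan matrix $Z$.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, sizes = 2.0, [2, 3]                        # eigenvalue and Jordan block sizes r_1 <= r_2
offsets = np.cumsum([0] + sizes)
n = offsets[-1]

# Z = J_2(lam) ⊕ J_3(lam): lam on the diagonal, ones on each block's superdiagonal.
Z = lam * np.eye(n)
for s, r in zip(offsets[:-1], sizes):
    Z[s:s + r - 1, s + 1:s + r] += np.eye(r - 1)

def commuting_block(ri, rj):
    """Random r_i x r_j block of the form (2): an upper-triangular Toeplitz core T,
    padded with zero columns on the left (r_i < r_j) or zero rows below (r_i > r_j)."""
    m = min(ri, rj)
    c = rng.standard_normal(m)
    T = sum(c[k] * np.eye(m, k=k) for k in range(m))   # upper-triangular Toeplitz
    B = np.zeros((ri, rj))
    if ri <= rj:
        B[:, rj - m:] = T
    else:
        B[:m, :] = T
    return B

B = np.block([[commuting_block(ri, rj) for rj in sizes] for ri in sizes])
print(np.allclose(B @ Z, Z @ B))                # True: such a B commutes with Z
```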

user1551
  • 149,263
1

Here is an easy, Jordan-free proof of the sort the OP seemed to be looking for. The original question was about scalars in $\mathbb R$, though this proof holds for any field $\mathbb F$ where $\text{char }\mathbb F = 0$ or $\text{char }\mathbb F \gt n$.

$\mathrm{rank}\big(XY-YX+I\big)=1$ and $\text{trace}\big(XY-YX+I\big) = n$, i.e. the commutator $C:=XY-YX$ has a $1$-dimensional $C$-invariant subspace given by $\text{image }\big(C+I\big)$ and an $(n-1)$-dimensional $C$-invariant subspace given by $\ker \big(C+I\big)$; these two subspaces are complementary because $n\neq 0$ in $\mathbb F$. This ultimately means $S^{-1}CS=\left[\begin{matrix}n-1 &\mathbf 0^T\\\mathbf 0 &-I_{n-1}\end{matrix}\right]$ for some $S\in GL_n\big(\mathbb F\big)$.
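For a concrete instance of such a commutator (a construction of my own, added purely for illustration): take $X$ with superdiagonal entries $n-1,n-2,\dots,1$ and $Y$ the subdiagonal shift; then $XY-YX=n\mathbf e_1\mathbf e_1^T-I$, so $C+I$ has rank $1$ and $C$ has eigenvalue $n-1$ once and $-1$ with multiplicity $n-1$, as claimed.

```python
import numpy as np

n = 6
# Illustrative pair with XY - YX = n e_1 e_1^T - I:
# X has superdiagonal entries n-1, n-2, ..., 1; Y is the subdiagonal shift.
X = np.diag(np.arange(n - 1, 0, -1.0), k=1)
Y = np.diag(np.ones(n - 1), k=-1)
C = X @ Y - Y @ X

target = -np.eye(n); target[0, 0] = n - 1
print(np.allclose(C, target))                           # True: C = n e_1 e_1^T - I
print(np.linalg.matrix_rank(C + np.eye(n)))             # 1
print(np.round(np.sort(np.linalg.eigvals(C).real), 6))  # [-1, ..., -1, n-1]
```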

$Z$ commutes with $C$ hence $\text{image }\big(C+I\big)$ is a $1$-dimensional $Z-$invariant subspace, i.e. $\mathbf w \in \text{image }\big(C+I\big)\implies Z\mathbf w = \lambda \cdot \mathbf w$, so $Z$ has an eigenvalue $\lambda \in \mathbb F$. And since $Z$ commutes with $X,Y$ and $C$ we know $W:=\ker\big(Z-\lambda I\big)$ is an invariant subspace for those matrices, giving us the restrictions $X_{\vert W}$, $Y_{\vert W}$ and $C_{\vert W} = X_{\vert W}Y_{\vert W} - Y_{\vert W}X_{\vert W}$. Letting $m_1$ and $m_2$ be the respective algebraic multiplicities of the (possible) eigenvalues $n-1$ and $-1$ of $ C_{\vert W}$, we have the following system of equations

$\left[\begin{matrix}1 &1\\n-1 & -1\end{matrix}\right]\left[\begin{matrix}m_1 \\ m_2\end{matrix}\right]=\left[\begin{matrix}\dim W \\ \text{trace}\big(C_{\vert W}\big)\end{matrix}\right]=\left[\begin{matrix}\dim W \\ 0\end{matrix}\right]\implies \dim W=n$

since $m_1 \in \big\{0,1\big\}$ (the eigenvalue $n-1$ has algebraic multiplicity $1$ for $C$, hence at most $1$ for $C_{\vert W}$), $m_2=(n-1)\cdot m_1$, and $m_1 + m_2 = \dim W \neq 0$, we must have $m_1 = 1$, hence $m_2=n-1$ and $\dim W = m_1 + m_2 = n$. Conclude $\text{rank }\big(Z-\lambda I\big) = 0\implies Z = \lambda I $, which completes the proof.

$\Big($When $\text{char }\mathbb F= 0$ there's a somewhat nicer alternative finish: $\left[\begin{matrix}1 &1\\n-1 & -1\end{matrix}\right]^{-1}=\left[\begin{matrix}\frac{1}{n} & \frac{1}{n}\\1 - \frac{1}{n} & - \frac{1}{n}\end{matrix}\right]$ which implies $\frac{\dim W}{n}=m_1 \in \mathbb N$ and $1\leq \dim W\leq n$ gives the result$\Big) $
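As a final numerical sanity check of the conclusion (using the same illustrative pair $X,Y$ as in the sketch above, not anything from the proof itself), one can solve the linear system $ZX=XZ$, $ZY=YZ$ for $Z$ and observe that its solution space is one-dimensional, i.e. consists exactly of the scalar matrices:

```python
import numpy as np

n = 6
X = np.diag(np.arange(n - 1, 0, -1.0), k=1)      # same illustrative pair as above
Y = np.diag(np.ones(n - 1), k=-1)
I = np.eye(n)

# ZX - XZ = 0 and ZY - YZ = 0 written as one linear system M vec(Z) = 0,
# using vec(AZB) = (B^T ⊗ A) vec(Z) for column-stacked vec.
M = np.vstack([np.kron(X.T, I) - np.kron(I, X),
               np.kron(Y.T, I) - np.kron(I, Y)])

null_dim = n * n - np.linalg.matrix_rank(M)
print(null_dim)                                   # 1: only scalar matrices commute with both
```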

user8675309
  • 12,193
0

$XY=YX$ implies $XY-YX=0$, so $XY-YX+I=I$. The hypothesis $\mathrm{rank}(XY-YX+I)=1$ then forces $\mathrm{rank}(I)=n=1$, so $X,Y,Z$ are $1\times 1$ matrices and in particular $Z=aI$.

  • Sorry, my mistake, I've just fixed it – qan Feb 27 '17 at 14:04
  • @Tsemo Aristide: How do you find that $XY = YX$? – Student Feb 27 '17 at 16:26
  • @Tsemo Aristide: as a counterexample to show that $I$ does not need to have rank 1: Consider $Z$ to be the 2 by 2 identity matrix, $X = \begin{pmatrix} 1 & 1\\ 0 & 1\end{pmatrix}$ and $Y = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$. Then we have that $XZ = ZX$, $YZ = ZY$ and the rank of $XY - YX + I$ is 1, but none of these matrices are one dimensional. – Student Feb 27 '17 at 16:31
  • @Student Tsemo Aristide answered the question as originally posed. The OP has made an edit that completely changes the problem. – Callus - Reinstate Monica Feb 27 '17 at 16:32
  • Oh okay, this explains everything! I was really confused why we could find that $XY = YX$. – Student Feb 27 '17 at 16:33
  • @callus: I am not sure what to do with my comment now, should I remove it? – Student Feb 27 '17 at 16:35
  • @Student No, in fact, I think it's important to leave it. If someone else takes a look at this page, there's a good chance they will be in the same boat you were in. – Callus - Reinstate Monica Feb 27 '17 at 20:42
0

After reading the various answers here (including, in particular, user8675309’s new answer), I find that the key is to look at the eigenspaces of $Z$. The generalised eigenspaces of $Z$ or the eigenspaces/generalised eigenspaces of $X,Y$ or $[X,Y]$ are not really useful.

Let us consider matrices in $M_n(\mathbb F)$ for some algebraically closed field $\mathbb F$ such that either $\operatorname{char}(\mathbb F)=0$ or $\operatorname{char}(\mathbb F)>n$. (In the OP’s case, just view $X,Y,Z$ as matrices in $M_n(\mathbb C)$.) Let $\lambda$ be any eigenvalue of $Z$ and $m=\dim\ker(Z-\lambda I_n)$ be its geometric multiplicity. Note that $\ker(Z-\lambda I_n)$ is an invariant subspace of both $X$ and $Y$, because $Z$ commutes with these two matrices.
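A tiny illustration of this invariance (not part of the proof; the particular $Z$ is arbitrary, and taking $X,Y$ to be polynomials in $Z$ is just a convenient way to produce matrices commuting with $Z$):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))
Z = S @ np.diag([2.0, 2.0, 2.0, 5.0]) @ np.linalg.inv(S)   # eigenvalue 2 has a 3-dim eigenspace

# Any matrices commuting with Z would do; polynomials in Z are used here for illustration.
X = Z @ Z - 3.0 * Z
Y = np.linalg.matrix_power(Z, 3) + Z

V = S[:, :3]                                     # basis of ker(Z - 2I)
for M in (X, Y):
    print(np.allclose((Z - 2.0 * np.eye(4)) @ (M @ V), 0))   # True: M maps ker(Z - 2I) into itself
```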

We claim that $\ker(Z-\lambda I_n)=\mathbb F^n$. Suppose the contrary. Then $\ker(Z-\lambda I_n)$ is a non-trivial proper subspace of $\mathbb F^n$. Therefore, by a change of basis, we may assume that $$ X=\pmatrix{X_1&\ast\\ 0&X_2},\quad Y=\pmatrix{Y_1&\ast\\ 0&Y_2},\quad Z=\pmatrix{\lambda I_m&\ast\\ 0&Z_2} $$ and $$ [X,Y]+I_n=\pmatrix{[X_1,Y_1]+I_m&\ast\\ 0&[X_2,Y_2]+I_{n-m}}. $$ By assumption, $\operatorname{char}(\mathbb F)$ is either zero or greater than $n$. Hence $\operatorname{tr}([X_1,Y_1]+I_m)=m$ is nonzero. In turn, $[X_1,Y_1]+I_m$ must be nonzero and $\operatorname{rank}([X_1,Y_1]+I_m)\ge1$. Similarly, $\operatorname{rank}([X_2,Y_2]+I_{n-m})\ge1$. But then $$ 2\le \operatorname{rank}([X_1,Y_1]+I_m)+ \operatorname{rank}([X_2,Y_2]+I_{n-m}) \le\operatorname{rank}([X,Y]+I)=1, $$ which is a contradiction. Hence we must have $\ker(Z-\lambda I_n)=\mathbb F^n$ at the beginning, meaning that $Z=\lambda I_n$.

user1551
  • 149,263