> I am very confused how they are making the jump to considering Jordan blocks instead of the entire $J$ matrix.
The Jordan normal form is block diagonal, hence so is every power of it, and a block diagonal matrix is equal to the identity iff every block is equal to the identity. So you can work one block at a time.
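In symbols: if $J$ has diagonal blocks $J_1, \dots, J_m$, then

$$J = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_m \end{pmatrix} \implies J^k = \begin{pmatrix} J_1^k & & \\ & \ddots & \\ & & J_m^k \end{pmatrix},$$

so a condition like $J^k = I$ holds iff it holds for each block $J_i$ separately.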
However, this is vast overkill: Jordan normal form is a difficult theorem, and a much simpler argument suffices here. People really do overrely on Jordan normal form when it isn't necessary.
Theorem: Let $T : V \to V$ be a linear map on a vector space over a field $K$ satisfying $f(T) = 0$, where $f(t) \in K[t]$ is a polynomial which splits completely and is separable over $K$. Then $T$ is diagonalizable; more precisely, $V$ is the finite direct sum $\bigoplus_{\lambda} V_{\lambda}$, where $\lambda$ ranges over the roots of $f$ and $V_{\lambda} = \ker(T - \lambda)$ is the $\lambda$-eigenspace of $T$.
Note that we don't need to assume $K$ is algebraically closed or that $V$ is finite-dimensional, which actually makes this theorem not a special case of the JNF argument in two different ways. For example, it continues to apply to representations of finite groups $G$ on infinite-dimensional vector spaces over fields $K$ containing $|G|$ distinct $|G|$-th roots of unity (so in particular $\operatorname{char} K \nmid |G|$). To apply the conclusion to the question at hand, take $K = \mathbb{C}$ and $f(t) = t^k - 1$. More generally:
Corollary: Let $T : V \to V$ be a linear map on a finite-dimensional vector space over a field $K$ whose characteristic polynomial or minimal polynomial splits completely and is separable over $K$. Then $T$ is diagonalizable.
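For a concrete illustration (my example, not from the original question): take $K = \mathbb{Q}$, $V = \mathbb{Q}^2$, and $T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then $T^2 = I$, so we may take $f(t) = t^2 - 1 = (t - 1)(t + 1)$, which splits completely and is separable over $\mathbb{Q}$. The theorem gives $V = V_1 \oplus V_{-1}$ with $V_1 = \text{span}(1, 1)$ and $V_{-1} = \text{span}(1, -1)$; that is, $T$ is diagonalizable with eigenvalues $\pm 1$.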
Proof 1. Here is an abstract argument. Write $f(t) = \prod_{\lambda} (t - \lambda)$. Since $f(T) = 0$, the action of $T$ on $V$ induces a $K[t]$-module structure factoring through the quotient
$$K[t]/f(t) \cong \prod_{\lambda} K_{\lambda}$$
where this isomorphism is given by the Chinese remainder theorem (this is where separability is used: the linear factors $t - \lambda$ are pairwise coprime) and the factor $K_{\lambda}$ is a copy of $K$, namely the one on which $t$ acts by $\lambda$; that is, the image of $t$ under the above isomorphism is the tuple with component $\lambda$ in $K_{\lambda}$. Write $e_{\lambda}(t)$ for the idempotent element of $K[t]/f(t)$ whose component in $K_{\lambda}$ is $1$ and all of whose other components are zero, so that $t = \sum_{\lambda} \lambda e_{\lambda}(t)$ and $t e_{\lambda}(t) = \lambda e_{\lambda}(t)$. Then $\sum_{\lambda} e_{\lambda}(t) = 1$, so we can write every vector in $V$ as a sum
$$v = \sum_{\lambda} e_{\lambda}(T) v$$
where $T e_{\lambda}(T) v = \lambda e_{\lambda}(T) v$, so $e_{\lambda}(T) v$ is either zero or an eigenvector with eigenvalue $\lambda$. Since $e_{\lambda}(T)$ acts as the identity on $V_{\lambda}$ (a polynomial in $T$ acts on a $\lambda$-eigenvector by its value at $\lambda$, and $e_{\lambda}(\lambda) = 1$), we get $V_{\lambda} = \text{im}(e_{\lambda}(T))$. The fact that the sum is direct follows from the fact that $e_{\lambda} e_{\lambda'} = 0$ if $\lambda \neq \lambda'$, or from the usual observation that eigenvectors corresponding to different eigenvalues are linearly independent (in fact this construction gives an independent proof that eigenvectors with different eigenvalues are linearly independent). $\Box$
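Here is a quick SymPy sketch (purely illustrative and not part of the proof; the example matrix and helper names are my own) that computes these idempotents for $f(t) = t^3 - t = t(t - 1)(t + 1)$ and verifies their properties on a small matrix:

```python
# Illustrative SymPy check of Proof 1; example matrix and helpers are my own.
import sympy as sp

t = sp.symbols('t')
f = t**3 - t                       # f(t) = t(t-1)(t+1), separable over Q
roots = [0, 1, -1]

def e(lam):
    """CRT idempotent e_lam: the polynomial of degree < deg f with
    e_lam(lam) = 1 and e_lam(mu) = 0 for every other root mu."""
    num = sp.prod([t - mu for mu in roots if mu != lam])
    return sp.expand(num / num.subs(t, lam))

def polyval(p, M):
    """Evaluate a polynomial p(t) at a square matrix M (Horner's rule)."""
    result = sp.zeros(*M.shape)
    for c in sp.Poly(p, t).all_coeffs():   # highest-degree coefficient first
        result = result * M + c * sp.eye(M.shape[0])
    return result

# A permutation matrix with T**2 = I, hence f(T) = T**3 - T = 0.
T = sp.Matrix([[0, 1, 0],
               [1, 0, 0],
               [0, 0, 1]])
assert polyval(f, T) == sp.zeros(3, 3)

E = {lam: polyval(e(lam), T) for lam in roots}
assert sum(E.values(), sp.zeros(3, 3)) == sp.eye(3)  # the e_lam(T) sum to I
for lam in roots:
    assert E[lam] * E[lam] == E[lam]                 # each e_lam(T) is idempotent
    assert T * E[lam] == lam * E[lam]                # its image lies in V_lam
# Note: E[0] is the zero matrix here, since 0 is a root of f but not an
# eigenvalue of T -- the theorem allows eigenspaces to be zero.
```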
Proof 2. Here is a more explicit version of the above argument which actually constructs the idempotents $e_{\lambda}$. We start with the observation that $f(T) = 0$ means $\prod_{\lambda} (T - \lambda) v = 0$ for all $v \in V$. For any $\lambda$ we can write this identity as
$$(T - \lambda) \frac{f(T)}{T - \lambda} v = 0$$
which shows that $\frac{f(T)}{T - \lambda} v$ is either $0$ or an eigenvector of $T$ with eigenvalue $\lambda$. This means $\frac{f(T)}{T - \lambda}$ is very close to being a projection onto $V_{\lambda}$, except that it might not be idempotent. But this turns out to be easy to fix. If we consider the special case that $v$ is itself an eigenvector $v_{\lambda}$ with eigenvalue $\lambda$ then
$$\frac{f(T)}{T - \lambda} v_{\lambda} = \prod_{\mu \neq \lambda} (T - \mu) v_{\lambda} = \prod_{\mu \neq \lambda} (\lambda - \mu) v_{\lambda} = f'(\lambda) v_{\lambda}$$
(here $\mu$ runs over the roots of $f$ other than $\lambda$, and the last equality is a general fact about polynomials: since $f(t) = \prod_{\lambda} (t - \lambda)$, the product rule gives $f'(t) = \sum_{\lambda} \prod_{\mu \neq \lambda} (t - \mu)$, and evaluating at $t = \lambda$ kills every summand divisible by $t - \lambda$, leaving $f'(\lambda) = \prod_{\mu \neq \lambda} (\lambda - \mu)$). This means we want to define
$$e_{\lambda}(T) = \frac{f(T)}{(T - \lambda) f'(\lambda)}.$$
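For a concrete instance of this formula (continuing the involution example from above): with $f(t) = t^2 - 1$ we have $f'(t) = 2t$, so

$$e_1(T) = \frac{f(T)}{(T - 1) f'(1)} = \frac{T + I}{2}, \qquad e_{-1}(T) = \frac{f(T)}{(T + 1) f'(-1)} = \frac{I - T}{2},$$

and using $T^2 = I$ one checks directly that these are idempotent, sum to $I$, and satisfy $T e_{\pm 1}(T) = \pm e_{\pm 1}(T)$.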
Since we know $e_{\lambda}(T)$ produces vectors in $V_{\lambda}$, the above calculation implies $e_{\lambda}(T)^2 v = e_{\lambda}(T) v$ for every $v \in V$, so $e_{\lambda}(T)$ is idempotent, has image contained in $V_{\lambda}$, and fixes $V_{\lambda}$, meaning it is a projection onto $V_{\lambda}$ as desired; moreover its kernel contains $\bigoplus_{\mu \neq \lambda} V_{\mu}$. This is very promising. It remains to show that every vector $v \in V$ can be written as a sum of eigenvectors, and any such sum must take the form $v = \sum_{\lambda} e_{\lambda}(T) v$. This means we need to prove that
$$I = \sum_{\lambda} e_{\lambda}(T) = \sum_{\lambda} \frac{f(T)}{(T - \lambda) f'(\lambda)}.$$
This follows from a general polynomial identity. The polynomial $g(t) = \sum_{\lambda} \frac{f(t)}{(t - \lambda) f'(\lambda)}$ has degree at most $\deg f - 1$ and satisfies $g(\lambda) = 1$ for every root $\lambda$ of $f$ (every term is divisible by $t - \lambda$ except one, and the term that isn't evaluates to $\frac{\prod_{\mu \neq \lambda} (\lambda - \mu)}{f'(\lambda)} = 1$). So $g(t) - 1$ is a polynomial of degree at most $\deg f - 1$ with $\deg f$ distinct roots, hence is identically zero; that is, $g(t) = 1$, and so $g(T) = I$. $\Box$
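As a sanity check (again purely illustrative), this partition-of-unity identity can be verified symbolically, here for a generic separable cubic with formal roots $a, b, c$:

```python
# Illustrative SymPy check of the identity sum_lam f(t)/((t - lam) f'(lam)) = 1.
import sympy as sp

t, a, b, c = sp.symbols('t a b c')   # a, b, c stand for three distinct roots
f = (t - a) * (t - b) * (t - c)
fprime = sp.diff(f, t)

g = sum(sp.cancel(f / (t - lam)) / fprime.subs(t, lam) for lam in (a, b, c))

# g - 1 combines into a single fraction whose numerator expands to 0,
# i.e. g(t) = 1 identically.
num, den = sp.fraction(sp.together(g - 1))
assert sp.expand(num) == 0
```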
A small note on notation. $\frac{f(T)}{T - \lambda}$ may not look well-defined if you think of the quotient as happening in endomorphisms, since $T - \lambda$ need not be invertible (and never is when $\lambda$ is an eigenvalue). It is shorthand for the value at $T$ of the polynomial $\frac{f(t)}{t - \lambda} \in K[t]$; since $t - \lambda$ divides $f(t)$ by hypothesis, this is unproblematic.
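For instance, with $f(t) = t^2 - 1$ and $\lambda = 1$, we have $\frac{f(t)}{t - \lambda} = t + 1$, so $\frac{f(T)}{T - 1}$ means the operator $T + I$.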
These arguments may look complicated, but they are a strict subset of the arguments you have to make in one form or another to prove JNF, and ultimately the only technical ingredients are fundamental facts about polynomials. A generalization of this argument which makes no assumptions on the polynomial $f$ can be used to begin the proof of the existence of rational canonical form.