9

On Hacker News, someone posted the following exercise: prove that every nonsingular complex symmetric matrix $M$ has a symmetric matrix square root.

This is an old chestnut. As the poster indicated, every nonsingular Jordan block $J$ has a square root that is a polynomial in $J$. Therefore, using Hermite interpolation (we need Hermite rather than Lagrange because the Jordan form can have several Jordan blocks of different sizes for the same eigenvalue), one can show that $M$ has a square root of the form $p(M)$ for some polynomial $p$. Such a square root is symmetric because $M$ is.

However, this implies that the minimal polynomial $f(x)$ of $M$ must divide $p(x)^2-x$. In other words,

  • given any non-constant polynomial $f\in\mathbb C[x]$ such that $f(0)\neq0$, there exists a polynomial $g$ such that $f(x)g(x)+x$ is a perfect square.
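As a quick sanity check of the bullet statement (an illustrative instance, not part of the original exercise), take $f(x)=(x-1)^2$, the minimal polynomial of the Jordan block $J=\begin{pmatrix}1&1\\0&1\end{pmatrix}$. With $g(x)=\tfrac14$, $$f(x)g(x)+x=\frac{(x-1)^2}{4}+x=\left(\frac{x+1}{2}\right)^2,$$ so $p(x)=\frac{x+1}{2}$ and $$p(J)=\begin{pmatrix}1&\tfrac12\\0&1\end{pmatrix},\qquad p(J)^2=\begin{pmatrix}1&1\\0&1\end{pmatrix}=J.$$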

I think this bullet statement must have an elementary proof based solely on abstract algebra, but I have forgotten most of what I learnt in class. Any ideas?

2 Answers

6

It suffices to find a polynomial $h(x)$ such that $$h(x)^2 \equiv x \bmod{f(x)}.$$ If we factor $f(x) = c\prod_{i}(x - \lambda_i)^{a_i}$ with distinct $\lambda_i$, then by the Chinese remainder theorem it suffices to find $h_i(x)$ such that $$h_i(x)^2 \equiv x \bmod{(x - \lambda_i)^{a_i}}.$$ Now use $\lambda_i \neq 0$ and Hensel lifting to finish the proof. Essentially, first note that the constant polynomial $h_i(x) = \sqrt{\lambda_i}$ satisfies the above for $a_i = 1$, then inductively construct the $h_i$ that satisfies the above for arbitrary $a_i$.
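To make the lifting step concrete: if $h^2\equiv x \bmod (x-\lambda)^k$ with $h(\lambda)=\sqrt{\lambda}\ne0$, then $h-\dfrac{h^2-x}{2\sqrt{\lambda}}$ satisfies the congruence modulo $(x-\lambda)^{k+1}$. Below is a minimal Python sketch of that induction. It assumes, purely for the sake of exact arithmetic, that $\lambda$ is rational with a rational square root (over $\mathbb C$ one would use complex coefficients instead). The names `poly_mul_trunc` and `hensel_sqrt` are hypothetical, and polynomials are stored as coefficient lists in powers of $t=x-\lambda$.

```python
from fractions import Fraction


def poly_mul_trunc(p, q, a):
    """Multiply coefficient lists p and q (powers of t), dropping t^a and higher."""
    out = [Fraction(0)] * a
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < a:
                out[i + j] += pi * qj
    return out


def hensel_sqrt(lam, root, a):
    """Return h (coefficients in t = x - lam) with h^2 = lam + t modulo t^a.

    Assumption of this sketch: root is an exact rational square root of lam,
    and root != 0.
    """
    lam, root = Fraction(lam), Fraction(root)
    assert root * root == lam and root != 0
    target = [lam, Fraction(1)] + [Fraction(0)] * a  # x = lam + t
    h = [root]  # base case: h^2 = x modulo t
    for k in range(1, a):
        # Invariant: h^2 = x modulo t^k. Lift it to modulo t^(k+1).
        sq = poly_mul_trunc(h, h, k + 1)
        err = [sq[i] - target[i] for i in range(k + 1)]  # only the t^k entry is nonzero
        h = h + [Fraction(0)]
        h = [h[i] - err[i] / (2 * root) for i in range(k + 1)]
    return h
```

For instance, `hensel_sqrt(4, 2, 3)` returns the coefficients $2,\ \tfrac14,\ -\tfrac1{64}$, i.e. $h(x)=2+\tfrac14(x-4)-\tfrac1{64}(x-4)^2$, whose square agrees with $x$ modulo $(x-4)^3$.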

abacaba
2

Interestingly, Maxime Bôcher (Introduction to Higher Algebra, 1907, pp. 297-299) used exactly this result to prove that every nonsingular matrix has a square root. Although I rediscovered this result by considering the square roots of Jordan blocks, the Jordan form is not needed in the proof of this result.

Bôcher's argument can be easily generalized to prove the following statement.

  • When $n$ is a positive integer, $f$ is a polynomial over an algebraically closed field of characteristic $0$ and $f(0)\ne0$, there exists a polynomial $g$ such that $fg+x$ is a perfect $n$-th power.

(Consequently, for every positive integer $n$, every nonsingular square matrix $A$ over an algebraically closed field of characteristic zero has a matrix $n$-th root that can be expressed as a polynomial in $A$. If $A$ is symmetric, its matrix $n$-th root can also be taken to be symmetric.)

Here is a sketch of his proof. Let $$f(x)=c\prod_{i=1}^m(x-\lambda_i)^{a_i}$$ where the $\lambda_i$s are distinct and nonzero (because $f(0)\ne0$). For each $i$, let $h_i(x)=c\prod_{j\neq i}(x-\lambda_j)^{a_j}$, the polynomial obtained from $f$ by omitting the factor $(x-\lambda_i)^{a_i}$. Bôcher's idea is to consider a polynomial of the form $$\chi=\sum_{i=1}^mq_ih_i$$ where $\deg q_i\leq a_i-1$. He wanted to show that the $q_i$s can be suitably chosen so that $\left(\chi(x)\right)^n-x$ is divisible by $f$.

Since $(x-\lambda_i)^{a_i}$ divides $h_j$ whenever $i\ne j$, we have $$\chi^n-x \equiv q_i^nh_i^n-x \pmod{(x - \lambda_i)^{a_i}}.$$ So, if each $q_i$ is chosen so that $q_i^nh_i^n-x$ is divisible by $(x - \lambda_i)^{a_i}$, then $\chi^n-x$ is divisible by each $(x - \lambda_i)^{a_i}$ and, as these factors are pairwise coprime, also by $f$.

It remains to choose an appropriate $q_i$. Define $$p_i=q_i^nh_i^n-x.$$ Then $p_i$ is divisible by $(x - \lambda_i)^{a_i}$ if and only if $0=p_i(\lambda_i)=p_i'(\lambda_i)=\cdots=p_i^{(a_i-1)}(\lambda_i)$. If we take $$q_i(x)=\sum_{r=0}^{a_i-1}c_r(x-\lambda_i)^r,$$ the system of equations reduces to $$\begin{align}0=p_i(\lambda_i)&=c_0^nh_i(\lambda_i)^n-\lambda_i,\tag{1}\\ 0=p_i^{(r)}(\lambda_i)&=n\,r!\,c_0^{n-1}c_rh_i(\lambda_i)^n+d_r\quad(\text{for } 1\le r<a_i),\tag{2}\end{align}$$ where $d_r$ is a constant that depends only on $c_0,c_1,\ldots,c_{r-1}$ (and on $h_i$ and $\lambda_i$) but not on $c_s$ for any $s\ge r$.

Since the underlying field is algebraically closed and $h_i(\lambda_i)\ne0$, we may take $c_0=\lambda_i^{1/n}/h_i(\lambda_i)$ in equation $(1)$, where $\lambda_i^{1/n}$ is any $n$-th root of $\lambda_i$. As $\lambda_i$ is nonzero, so is $c_0$, and since the field has characteristic $0$ we have $n\,r!\,c_0^{n-1}h_i(\lambda_i)^n\ne0$ in equation $(2)$. Hence we may determine $c_1,c_2,\ldots,c_{a_i-1}$ from $(2)$ successively and obtain $q_i$.
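A small illustrative instance of the construction (not taken from Bôcher's text): take $n=2$ and $f(x)=(x-1)(x-4)$, so that $c=1$, $a_1=a_2=1$, $h_1=x-4$, $h_2=x-1$, and each $q_i$ is a constant. The requirement $p_i(\lambda_i)=0$ with $p_i=q_i^2h_i^2-x$ reads $$9q_1^2=1,\qquad 9q_2^2=4,$$ and choosing $q_1=\tfrac13$, $q_2=\tfrac23$ gives $$\chi=\tfrac13(x-4)+\tfrac23(x-1)=x-2,\qquad \chi^2-x=x^2-5x+4=f(x).$$ So $g=1$ works here: $fg+x=(x-2)^2$.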

(I guess the proof in Shengtong Zhang's answer is essentially the same, but he has used something called "Hensel lifting" that I have never heard of and so I don't really understand his answer.)