
I'm having trouble understanding the following proof, which is taken from *A Brief Introduction to Numerical Analysis* by Eugene E. Tyrtyshnikov.

(Note: The polynomials in the following are complex)

Theorem 3.9.1 Consider a parametrized batch of polynomials $$p(x,t)=x^n+a_1(t)x^{n-1}+...+a_n(t),$$ where $a_1(t),...,a_n(t)\in C[\alpha,\beta]$. Then there exist functions $$x_1(t),...,x_n(t)\in C[\alpha,\beta]$$ such that $$p(x_i(t),t)=0\;\;\;\textit{for}\;\;\;\alpha\le t\le\beta,\qquad i=1,...,n.$$

[The author argues why establishing the existence of one such function is sufficient. I will comment on this part later.]

[The author states and proves the Arzelà–Ascoli Theorem, which will be used in the following. Note that he uses "uniformly continuous" to mean equicontinuous.]

Proof of Theorem 3.9.1. Build up on $[\alpha,\beta]$ a sequence of uniform grids $$\alpha=t_{0m}<t_{1m}<...<t_{mm}=\beta;\qquad t_{i+1,m}-t_{im}=\frac{\beta-\alpha}{m}.$$ Let $y_m(t)$ be a piecewise linear function with breaks at $t_{0m},t_{1m},...,t_{mm}$. Define the values at the nodes as follows.

Take a root $z_0$ of the polynomial $p(x,\alpha)$, and, for all $m$, set $$y_m(t_{0m})=z_{0m}\equiv z_0.$$ Further, let $z_{1m}$ be any of those roots of the polynomial $p(x,t_{1m})$ nearest to $z_{0m}$, and, by induction, let $z_{i+1,m}$ be any of the roots of the polynomial $p(x,t_{i+1,m})$ nearest to $z_{im}$. Set $$y_m(t_{im})=z_{im},\qquad i=1,...,m.$$ The uniform boundedness of the piecewise linear functions $y_m(t)$ is evident.
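To make sure I follow the construction, I wrote a small numerical sketch of it for a hypothetical family of my own choosing, $p(x,t)=x^2-t$ on $[-1,1]$, whose roots $\pm\sqrt t$ are available in closed form (the helper names are mine, not from the book):

```python
import cmath

# Sketch of the construction for the sample family p(x, t) = x^2 - t on
# [alpha, beta] = [-1, 1] (my own choice, not from the book).

def roots_quadratic(a1, a2):
    """Both complex roots of x^2 + a1*x + a2 = 0 (quadratic formula)."""
    d = cmath.sqrt(a1 * a1 - 4 * a2)
    return [(-a1 + d) / 2, (-a1 - d) / 2]

def node_values(alpha, beta, m, a1, a2):
    """Node values z_{0m}, ..., z_{mm} of y_m: z_{0m} is a fixed root of
    p(x, alpha); each z_{i+1,m} is a root of p(x, t_{i+1,m}) nearest to z_{im}."""
    h = (beta - alpha) / m
    z = roots_quadratic(a1(alpha), a2(alpha))[0]  # the fixed choice z_0
    values = [z]
    for i in range(1, m + 1):
        t = alpha + i * h
        # pick any root of p(x, t_{i,m}) nearest to the previous node value
        z = min(roots_quadratic(a1(t), a2(t)), key=lambda w: abs(w - z))
        values.append(z)
    return values

# p(x, t) = x^2 - t, i.e. a1(t) = 0 and a2(t) = -t
vals = node_values(-1.0, 1.0, 200, lambda t: 0.0, lambda t: -t)
```

For this family the walk starts at the root $i$ of $x^2+1$, slides down the imaginary axis to $0$ at $t=0$, and then continues along the real axis, with consecutive node values staying close.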

I devised the following reasoning with some help: For $i=1,...,n$, there are constants $A_i$ such that $|a_i(t)|\le A_i$ for all $t\in[\alpha,\beta]$, by the Extreme Value Theorem. Let $R:=1+\sum_{i=1}^nA_i$. Then for any $x\in\mathbb{C}$ with $|x|\ge R$, we have \begin{align} |p(x,t)|\ge&|x|^n-\sum_{i=1}^n|a_i(t)||x|^{n-i}\\ \ge&|x|^n-|x|^{n-1}\sum_{i=1}^nA_i\\ \ge&R^{n-1}\left(|x|-\sum_{i=1}^nA_i\right)\\ \ge&R^{n-1}>0. \end{align} Thus, the roots of $p(x,t)$ are contained in the open disk of radius $R$ centered at the origin. For $\lambda\in[0,1]$ and $z_1,z_2\in\mathbb{C}$, $$|\lambda z_1+(1-\lambda)z_2|\le\lambda|z_1|+(1-\lambda)|z_2|\le\max\{|z_1|,|z_2|\}.$$ This shows that each $y_m(t)$ attains its maximum absolute value at one of the nodes. The values at the nodes are roots of $p(x,t)$, hence $R$ uniformly bounds the $y_m(t)$.
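As a sanity check on this bound, I verified it numerically for a sample family of my own choosing (not from the book), $p(x,t)=x^2+\cos(t)\,x+(t-\tfrac12)$ on $[0,1]$:

```python
import cmath
import math

# Sanity check: for p(x, t) = x^2 + cos(t) x + (t - 1/2) on [0, 1]
# (a sample family of my own), every root of every p(x, t) should lie
# in the open disk of radius R = 1 + A_1 + A_2.

def roots_quadratic(a1, a2):
    """Both complex roots of x^2 + a1*x + a2 = 0."""
    d = cmath.sqrt(a1 * a1 - 4 * a2)
    return [(-a1 + d) / 2, (-a1 - d) / 2]

A1 = 1.0                 # max |cos t| on [0, 1]
A2 = 0.5                 # max |t - 1/2| on [0, 1]
R = 1 + A1 + A2          # R = 2.5

all_inside = all(
    abs(r) < R
    for k in range(101)                      # t = 0, 0.01, ..., 1
    for r in roots_quadratic(math.cos(k / 100), k / 100 - 0.5)
)
```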

The uniform continuity emanates from the inequality $$|z_{i+1,m}-z_{im}|\le|p(z_{im},t_{i+1,m})|^{\frac{1}{n}}=...$$

Why does this hold? This is the most enigmatic part of the proof for me. I assume it uses the fact that $z_{i+1,m}$ was chosen as a root nearest to $z_{im}$ in the construction, but I don't see how it all comes together.
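While I cannot derive the inequality, I did at least check it numerically along the nearest-root walk for my sample family $p(x,t)=x^2-t$ on $[-1,1]$ (so $n=2$), which convinces me it is not a typo:

```python
import cmath

# Check |z_{i+1,m} - z_{im}| <= |p(z_{im}, t_{i+1,m})|^(1/n) at every step
# of the nearest-root walk, for the sample family p(x, t) = x^2 - t (n = 2).

def roots(t):
    s = cmath.sqrt(t)
    return [s, -s]          # the two roots of x^2 - t

def p(x, t):
    return x * x - t

alpha, beta, m = -1.0, 1.0, 200
h = (beta - alpha) / m
z = roots(alpha)[0]
holds = True
for i in range(m):
    t_next = alpha + (i + 1) * h
    z_next = min(roots(t_next), key=lambda w: abs(w - z))
    # small slack for floating-point round-off
    holds = holds and abs(z_next - z) <= abs(p(z, t_next)) ** 0.5 + 1e-12
    z = z_next
```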

$$...=|p(z_{im},t_{i+1,m})-p(z_{im},t_{im})|^{\frac{1}{n}}\le R\left(\max_{\substack{\alpha\le t_1,t_2\le\beta\\|t_1-t_2|\le\frac{\beta-\alpha}{m}}}\sum_{j=1}^n|a_j(t_1)-a_j(t_2)|\right)^{\frac{1}{n}},$$ where $R\ge1$ is the radius (not necessarily minimal) of a circle encompassing all the roots of all the polynomials $p(x,t)$ for $\alpha\le t\le\beta$.

The existence of such an $R$ has already been established above. The equality is obvious as $p(z_{im},t_{im})=0$ by construction. For any $z_{im}$ and $k=0,1,...,n-1$, we have $|z_{im}|^k\le R^n$: either $|z_{im}|<1$, in which case $|z_{im}|^k\le1\le R\le R^n$, or $|z_{im}|\ge1$, in which case $|z_{im}|^k\le R^k\le R^n$, using $|z_{im}|\le R$ and $R\ge1$. Then, \begin{align} |p(z_{im},t_{i+1,m})-p(z_{im},t_{im})|^{1/n}&=\left|z_{im}^n+\sum_{j=1}^na_j(t_{i+1,m})z_{im}^{n-j}-z_{im}^n-\sum_{j=1}^na_j(t_{im})z_{im}^{n-j}\right|^{1/n}\\ &=\left|\sum_{j=1}^nz_{im}^{n-j}(a_j(t_{i+1,m})-a_j(t_{im}))\right|^{1/n}\\ &\le\left(\sum_{j=1}^n|z_{im}|^{n-j}|a_j(t_{i+1,m})-a_j(t_{im})|\right)^{1/n}\\ &\le R\left(\sum_{j=1}^n|a_j(t_{i+1,m})-a_j(t_{im})|\right)^{1/n}\\ &\le R\left(\max_{\substack{\alpha\le t_1,t_2\le\beta\\|t_1-t_2|\le\frac{\beta-\alpha}{m}}}\sum_{j=1}^n|a_j(t_1)-a_j(t_2)|\right)^{1/n}. \end{align} The last inequality follows since $\alpha\le t_{i+1,m},t_{im}\le\beta$ and $|t_{i+1,m}-t_{im}|=\frac{\beta-\alpha}{m}$. It remains to see why that maximum actually exists:

Let $D:=\left\{(t_1,t_2)\in[\alpha,\beta]^2\mid|t_1-t_2|\le\frac{\beta-\alpha}{m}\right\}$. This set is clearly bounded. Now if $((t_{1k},t_{2k}))_k$ is a sequence in $D$ such that $(t_{1k},t_{2k})\rightarrow(t_1,t_2)$, then taking limits in $|t_{1k}-t_{2k}|\le\frac{\beta-\alpha}{m}$ yields $|t_1-t_2|\le\frac{\beta-\alpha}{m}$, so $(t_1,t_2)\in D$. Thus, $D$ is closed and, by Heine–Borel, compact. The function $f\colon D\rightarrow\mathbb{R}$ with $f(t_1,t_2)=\sum_{j=1}^n|a_j(t_1)-a_j(t_2)|$ is continuous since each $a_j(t)$ is continuous. Thus, by the Extreme Value Theorem, $f$ attains a maximum on $D$.

Even then, however, I do not see how the above inequality implies equicontinuity. Demonstrating equicontinuity would require bounding $|y_m(s_1)-y_m(s_2)|$ for all $m\in\mathbb{N}$ in terms of $|s_1-s_2|$. Now, if $t_{im}\le s_1,s_2\le t_{i+1,m}$, then $|y_m(s_1)-y_m(s_2)|\le|z_{i+1,m}-z_{im}|$, since $y_m$ is linear on that subinterval. However, one cannot impose such conditions on $s_1,s_2$, especially since the nodes get arbitrarily close for arbitrarily large $m$. So how does one use the above inequality to prove equicontinuity?

Using the Arzelà–Ascoli theorem we find a uniformly convergent subsequence. Take into account that the limit of a uniformly convergent sequence of continuous functions on $[\alpha,\beta]$ must be a continuous function. The only thing left to check is that the limit function $y(t)$ satisfies $p(y(t),t)=0$ for all $\alpha\le t\le\beta$. That will do the proof.$\quad\square$

Let $(y_{m_k})_k$ be the uniformly convergent subsequence. Let $t\in[\alpha,\beta]$ be arbitrary and $t_{i_km_k}$ be the node closest to $t$ for each $k\in\mathbb{N}$. Then, by construction, we have $|t-t_{i_km_k}|\le\frac{\beta-\alpha}{m_k}\rightarrow0$, hence $t_{i_km_k}\rightarrow t$. Now, since $y_{m_k}\rightarrow y$ uniformly, $y_{m_k}(t_{i_km_k})\rightarrow y(t)$. Finally, from the continuity of the $a_i(t)$ and the standard results on sums and products of limits, we have $$0=p(z_{i_km_k},t_{i_km_k})=p(y_{m_k}(t_{i_km_k}),t_{i_km_k})\rightarrow p(y(t),t),\qquad\text{hence}\;\;p(y(t),t)=0.$$

Now the initial paragraph I alluded to earlier:

To begin the proof, note that it is sufficient to establish the existence of any single continuous function $x_n(t)$ such that $p(x_n(t),t)=0$ for $\alpha\le t\le\beta$. Should this be done, we write $$p(x,t)=(x-x_n(t))q(x,t),$$ where $q(x,t)=x^{n-1}+b_1(t)x^{n-2}+...+b_{n-1}(t)$. On the strength of the familiar algorithm for dividing polynomials, $b_1(t),...,b_{n-1}(t)\in C[\alpha,\beta]$. So we may prove it by induction.
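For concreteness, I take the "familiar algorithm" to be synthetic (Horner) division by $(x-x_n(t))$; here is a sketch (the helper name `deflate` is mine, not from the book):

```python
def deflate(coeffs, r):
    """Divide the monic polynomial x^n + a1 x^{n-1} + ... + an by (x - r)
    using synthetic (Horner) division.  coeffs = [a1, ..., an]; returns
    ([b1, ..., b_{n-1}], remainder), the quotient being
    x^{n-1} + b1 x^{n-2} + ... + b_{n-1}.  The recursion is
    b_j = a_j + r * b_{j-1} with b_0 = 1, and the remainder equals p(r),
    which vanishes when r is a root."""
    b = []
    acc = 1  # b_0, the leading coefficient of the quotient
    for a in coeffs[:-1]:
        acc = a + r * acc
        b.append(acc)
    remainder = coeffs[-1] + r * acc
    return b, remainder

# Example: x^2 - 1 divided by (x - 1) gives x + 1 with remainder 0.
q, rem = deflate([0, -1], 1)
```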

The application of polynomial division is clear, since $x_n(t)$ is a root of $p(x,t)$ for every $t\in[\alpha,\beta]$. Comparing coefficients also yields that the coefficient of $x^{n-1}$ in $q(x,t)$ must be $1$. The polynomial $p(x,t)$ is continuous in $t$ due to the continuity of the $a_i(t)$, so if $t_k\rightarrow t$, then $$(x-x_n(t_k))q(x,t_k)=p(x,t_k)\rightarrow p(x,t)=(x-x_n(t))q(x,t).$$ However, we already know that $(x-x_n(t_k))\rightarrow(x-x_n(t))$ by the continuity of $x_n(t)$. Is this sufficient to conclude $q(x,t_k)\rightarrow q(x,t)$? (I believe it should be, but the multiple variables and the need to avoid an accidental division by zero make me wary.)

Furthermore, how may we conclude the continuity of the individual coefficients $b_i(t)$ for $i=1,...,n-1$ from the continuity of $q(x,t)$ in $t$ as a whole? Can this be done by using the linear independence of the monomials or something of the sort?

Finally, the statement of the theorem only says there exist functions $x_1(t),...,x_n(t)$ that are roots of $p(x,t)$ for each $t\in[\alpha,\beta]$. However, this would trivially be satisfied by choosing them all to be the same function. Does the inductive step not actually yield the stronger claim that $$p(x,t)=(x-x_1(t))\cdot...\cdot(x-x_n(t))\text{ for }t\in[\alpha,\beta],$$ which entails that, for each $t\in[\alpha,\beta]$, $x_1(t),...,x_n(t)$ are precisely the $n$ roots of $p(x,t)$ counted with multiplicity?

Any answer to the specific questions asked above, as well as any correction of mistakes in my additions or any supplementation the argument may still need, will be appreciated. Thanks in advance.

Thorgott
  • Do you need this proof? One would think that there were shorter and more direct ones. – Lubin Apr 26 '19 at 00:29
  • Though I am somewhat invested in this one by now, I do not need it in particular. The issue (and also the reason why this is the proof I'm consulting in the first place) is that all the other ones I found are based on the Implicit Function Theorem or Rouche's Theorem, neither of which I have learned yet. If you know another alternative, I'd be glad to hear it. – Thorgott Apr 26 '19 at 00:56
  • Well, actually, the only approach I know involves Implicit Function Th., and even that does not do the whole job for you. Oh well. – Lubin Apr 26 '19 at 01:01
  • https://math.stackexchange.com/questions/656858/on-continuity-of-roots-of-a-polynomial-depending-on-a-real-parameter/664473#664473 – Moishe Kohan Apr 29 '19 at 18:15
  • @MoisheKohan Thanks for your reply, but I'm afraid that answer uses machinery that is way beyond me. – Thorgott Apr 29 '19 at 18:35
  • If you want just a proof of continuity of the roots in the sense that for a monic polynomial $P$ of degree $n$ and any $\epsilon >0$ there is a $\delta >0$ (explicit in terms of the coefficients of $P$ and $\epsilon$) s.t. for all monic polynomials $Q$ of degree n for which each coefficient is at most $\delta$ from the respective coefficient of $P$, the roots of $Q$ can be numbered in such a way that each is at most $\epsilon$ from a root of $P$ (including multiplicities etc), then there is a fairly elementary such using only geometric series properties and some clever stuff – Conrad Apr 29 '19 at 22:35
  • @Conrad That sounds like a very good starting point. Do you know where I might find that proof? – Thorgott Apr 30 '19 at 10:57
  • It is in the superb monograph: Analytic Theory of Polynomials by Rahman and Schmeisser (page 10, Theorem 1.3.1), and the elementary lemma (used to prove Theorem 1.1.4, page 5, giving primitive bounds on roots in terms of coefficients) that is crucial, states that if a monic polynomial of degree $n$ has the first $m \ge 1$ coefficients less or equal than $1$ in absolute value (starting from the free term to the $m-1$ power one) then, its first $m$ roots (as ordered in increasing absolute value) are at most $2$ in absolute value. If needed I can definitely sketch the proof here – Conrad Apr 30 '19 at 11:05
  • The theorem simply follows from Rouché. – G.Kós May 01 '19 at 10:33
