I'm reading "Linear Algebra" by Kenneth Hoffman and Ray Kunze.
I don't quite understand why there's a long proof in $\S$6.4 Theorem 6.
First the triangular matrix is defined:
An $n\times n$ matrix $A$ is called triangular if $A_{ij}=0$ whenever $i>j$ or if $A_{ij}=0$ whenever $i<j$.
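For instance, in the $3\times 3$ case the two shapes allowed by this definition look as follows (the entries $a_{ij}$ are placeholders I am using for illustration, not notation from the book):

$$
\underbrace{\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ 0 & a_{22} & a_{23}\\ 0 & 0 & a_{33}\end{pmatrix}}_{A_{ij}=0 \text{ whenever } i>j}
\qquad\text{or}\qquad
\underbrace{\begin{pmatrix} a_{11} & 0 & 0\\ a_{21} & a_{22} & 0\\ a_{31} & a_{32} & a_{33}\end{pmatrix}}_{A_{ij}=0 \text{ whenever } i<j}
$$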
Then triangulable is defined:
The linear operator $T$ is called triangulable if there is an ordered basis in which $T$ is represented by a triangular matrix.
Then there's Theorem 5:
Let $V$ be a finite-dimensional vector space over the field $F$ and let $T$ be a linear operator on $V$. Then $T$ is triangulable if and only if the minimal polynomial for $T$ is a product of linear polynomials over $F$.
Now comes Theorem 6:
Let $V$ be a finite-dimensional vector space over the field $F$ and let $T$ be a linear operator on $V$. Then $T$ is diagonalizable if and only if the minimal polynomial for $T$ has the form $p = (x - c_1) \dots (x - c_k)$ where $c_1, \dots , c_k$ are distinct elements of $F$.
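A small example of a minimal polynomial of this form (my own illustration, not from the book): take the diagonal matrix $A=\operatorname{diag}(1,1,2)$ over $F=\mathbb{Q}$. Its characteristic polynomial has a repeated factor, but its minimal polynomial does not:

$$
(A-I)(A-2I)=\begin{pmatrix}0&0&0\\0&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}-1&0&0\\0&-1&0\\0&0&0\end{pmatrix}=0,
\qquad\text{so}\qquad p=(x-1)(x-2),
$$

since neither $A-I$ nor $A-2I$ is zero on its own.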
The proof is as follows (the numbers (1), (2), ... were added by me):
Proof
(1) We have noted earlier that, if $T$ is diagonalizable, its minimal polynomial is a product of distinct linear factors (see the discussion prior to Example 4).
(2) To prove the converse, let $W$ be the subspace spanned by all of the characteristic vectors of $T$, and suppose $W \ne V$. ....
What I don't understand is (2): why do we need such a long proof here (details are below)?
Theorem 5 already proved that "the minimal polynomial factors as $p=(x-c_1)^{r_1}\dots(x-c_k)^{r_k}$, $c_i$ distinct $\Rightarrow$ $T$ is triangulable";
this part of Theorem 6 is "the minimal polynomial factors as $p=(x-c_1) \dots(x-c_k)$, $c_i$ distinct $\Rightarrow$ $T$ is triangulable",
so don't we just need to let all the $r_i$ be $1$?
Proof details, excerpted from Hoffman and Kunze:
(1) We have noted earlier that, if $T$ is diagonalizable, its minimal polynomial is a product of distinct linear factors (see the discussion prior to Example 4).
(2) To prove the converse, let $W$ be the subspace spanned by all of the characteristic vectors of $T$, and suppose $W \ne V$.
(3) By the lemma used in the proof of Theorem 5, there is a vector $\alpha$ not in $W$ and a characteristic value $c_j$ of $T$ such that the vector $\beta= (T - c_jI)\alpha$ lies in $W$.
(4) Since $\beta$ is in $W$, $\beta = \beta_1+\dots+\beta_k$ where $T\beta_i = c_i\beta_i$, $1\le i\le k$, and therefore the vector $h(T)\beta = h(c_1)\beta_1+\dots+h(c_k)\beta_k$ is in $W$, for every polynomial $h$.
(5) Now $p = (x-c_j)q$, for some polynomial $q$.
(6) Also $q- q(c_j) = (x - c_j)h$.
(7) We have $q(T)\alpha - q(c_j)\alpha = h(T)(T - c_jI)\alpha = h(T)\beta$.
(8) But $h(T)\beta$ is in $W$ and, since $0 = p(T)\alpha = (T - c_jI)q(T)\alpha$, the vector $q(T)\alpha$ is in $W$.
(9) Therefore, $q(c_j)\alpha$ is in $W$.
(10) Since $\alpha$ is not in $W$, we have $q(c_j) = 0$.
(11) That contradicts the fact that $p$ has distinct roots. QED.
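To make steps (5) to (7) concrete, here is my own unpacking of the smallest case, $k=2$ with $j=1$ (an illustration I worked out, not part of the book's proof):

$$
\begin{aligned}
p &= (x-c_1)(x-c_2), \qquad q = x-c_2,\\
q - q(c_1) &= (x-c_2)-(c_1-c_2) = x-c_1 = (x-c_1)\cdot 1, \qquad\text{so } h=1,\\
q(T)\alpha - q(c_1)\alpha &= (T-c_2I)\alpha-(c_1-c_2)\alpha = (T-c_1I)\alpha = \beta = h(T)\beta .
\end{aligned}
$$

In this case $q(c_1)=c_1-c_2\neq 0$ exactly because $c_1$ and $c_2$ are distinct, which seems to be the fact that step (11) relies on.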