> I am very confused how they are making the jump to considering Jordan blocks instead of the entire $J$ matrix.
The Jordan normal form is block diagonal, hence so is every power of it, and a block diagonal matrix is equal to the identity iff every block is equal to the identity. So you can work one block at a time.
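In symbols: if $J$ has diagonal blocks $J_1, \dots, J_m$, then

$$J = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_m \end{pmatrix} \implies J^k = \begin{pmatrix} J_1^k & & \\ & \ddots & \\ & & J_m^k \end{pmatrix},$$

so a condition like $J^k = I$ holds iff it holds for each block $J_i$ separately.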
However, this is vast overkill: Jordan normal form is a difficult theorem, and a much simpler argument suffices here. People really do overrely on Jordan normal form when it isn't necessary.
Theorem: Let $T : V \to V$ be a linear map on a vector space over a field $K$ satisfying $f(T) = 0$, where $f(t) \in K[t]$ is a polynomial which splits completely and is separable over $K$. Then $T$ is diagonalizable; more precisely, $V$ is the finite direct sum $\bigoplus_{\lambda} V_{\lambda}$, where $\lambda$ ranges over the roots of $f$ and $V_{\lambda} = \ker(T - \lambda)$ is the $\lambda$-eigenspace of $T$.
Note that we don't need to assume $K$ is algebraically closed or that $V$ is finite-dimensional, which actually makes this theorem not a special case of the JNF argument in two different ways. For example, it continues to apply to representations of finite groups $G$ on infinite-dimensional vector spaces over fields $K$ containing $|G|$ distinct $|G|$-th roots of unity (so in particular $\operatorname{char} K \nmid |G|$). To apply the conclusion to the question at hand, take $K = \mathbb{C}$ and $f(t) = t^k - 1$. More generally:
Corollary: Let $T : V \to V$ be a linear map on a finite-dimensional vector space over a field $K$ whose characteristic polynomial or minimal polynomial splits completely and is separable over $K$. Then $T$ is diagonalizable.
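For a concrete illustration (my example, not from the original question): take $K = \mathbb{Q}$, $V = \mathbb{Q}^2$, and $T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then $T^2 = I$, so we may take $f(t) = t^2 - 1 = (t - 1)(t + 1)$, which splits completely and is separable over $\mathbb{Q}$. The theorem gives $V = V_1 \oplus V_{-1}$ with $V_1 = \text{span}(1, 1)$ and $V_{-1} = \text{span}(1, -1)$; that is, $T$ is diagonalizable with eigenvalues $\pm 1$.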
Proof 1. Here is an abstract argument. Write $f(t) = \prod_{\lambda} (t - \lambda)$. Since $f(T) = 0$, the action of $T$ on $V$ induces a $K[t]$-module structure factoring through the quotient
$$K[t]/f(t) \cong \prod_{\lambda} K_{\lambda}$$
where this isomorphism is given by the Chinese remainder theorem (this is where separability is used: the linear factors $t - \lambda$ are pairwise coprime) and the factor $K_{\lambda}$ is a copy of $K$, namely the one on which $t$ acts by $\lambda$; that is, the image of $t$ under the above isomorphism is the tuple with component $\lambda$ in $K_{\lambda}$. Write $e_{\lambda}(t)$ for the idempotent element of $K[t]/f(t)$ whose component in $K_{\lambda}$ is $1$ and all of whose other components are zero, so that $t = \sum_{\lambda} \lambda e_{\lambda}(t)$ and $t e_{\lambda}(t) = \lambda e_{\lambda}(t)$. Then $\sum_{\lambda} e_{\lambda}(t) = 1$, so we can write every vector in $V$ as a sum
$$v = \sum_{\lambda} e_{\lambda}(T) v$$
where $T e_{\lambda}(T) v = \lambda e_{\lambda}(T) v$, so $e_{\lambda}(T) v$ is either zero or an eigenvector with eigenvalue $\lambda$. Since $e_{\lambda}(T)$ acts as the identity on $V_{\lambda}$ (a polynomial in $T$ acts on a $\lambda$-eigenvector by its value at $\lambda$, and $e_{\lambda}(\lambda) = 1$), we get $V_{\lambda} = \text{im}(e_{\lambda}(T))$. The fact that the sum is direct follows from the fact that $e_{\lambda} e_{\lambda'} = 0$ if $\lambda \neq \lambda'$, or from the usual observation that eigenvectors corresponding to different eigenvalues are linearly independent (in fact this construction gives an independent proof that eigenvectors with different eigenvalues are linearly independent). $\Box$
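Here is a quick SymPy sketch (purely illustrative and not part of the proof; the example matrix and helper names are my own) that computes these idempotents for $f(t) = t^3 - t = t(t - 1)(t + 1)$ and verifies their properties on a small matrix:

```python
# Illustrative SymPy check of Proof 1; example matrix and helpers are my own.
import sympy as sp

t = sp.symbols('t')
f = t**3 - t                       # f(t) = t(t-1)(t+1), separable over Q
roots = [0, 1, -1]

def e(lam):
    """CRT idempotent e_lam: the polynomial of degree < deg f with
    e_lam(lam) = 1 and e_lam(mu) = 0 for every other root mu."""
    num = sp.prod([t - mu for mu in roots if mu != lam])
    return sp.expand(num / num.subs(t, lam))

def polyval(p, M):
    """Evaluate a polynomial p(t) at a square matrix M (Horner's rule)."""
    result = sp.zeros(*M.shape)
    for c in sp.Poly(p, t).all_coeffs():   # highest-degree coefficient first
        result = result * M + c * sp.eye(M.shape[0])
    return result

# A permutation matrix with T**2 = I, hence f(T) = T**3 - T = 0.
T = sp.Matrix([[0, 1, 0],
               [1, 0, 0],
               [0, 0, 1]])
assert polyval(f, T) == sp.zeros(3, 3)

E = {lam: polyval(e(lam), T) for lam in roots}
assert sum(E.values(), sp.zeros(3, 3)) == sp.eye(3)  # the e_lam(T) sum to I
for lam in roots:
    assert E[lam] * E[lam] == E[lam]                 # each e_lam(T) is idempotent
    assert T * E[lam] == lam * E[lam]                # its image lies in V_lam
# Note: E[0] is the zero matrix here, since 0 is a root of f but not an
# eigenvalue of T -- the theorem allows eigenspaces to be zero.
```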
Proof 2. Here is a more explicit version of the above argument which actually constructs the idempotents $e_{\lambda}$. We start with the observation that $f(T) = 0$ means $\prod_{\lambda} (T - \lambda) v = 0$ for all $v \in V$. For any $\lambda$ we can write this identity as
$$(T - \lambda) \frac{f(T)}{T - \lambda} v = 0$$
which shows that $\frac{f(T)}{T - \lambda} v$ is either $0$ or an eigenvector of $T$ with eigenvalue $\lambda$. This means $\frac{f(T)}{T - \lambda}$ is very close to being a projection onto $V_{\lambda}$, except that it might not be idempotent. But this turns out to be easy to fix. If we consider the special case that $v$ is itself an eigenvector $v_{\lambda}$ with eigenvalue $\lambda$ then
$$\frac{f(T)}{T - \lambda} v_{\lambda} = \prod_{\mu \neq \lambda} (T - \mu) v_{\lambda} = \prod_{\mu \neq \lambda} (\lambda - \mu) v_{\lambda} = f'(\lambda) v_{\lambda}$$
(here $\mu$ runs over the roots of $f$ other than $\lambda$, and the last equality is a general fact about polynomials: since $f(t) = \prod_{\lambda} (t - \lambda)$, the product rule gives $f'(t) = \sum_{\lambda} \prod_{\mu \neq \lambda} (t - \mu)$, and evaluating at $t = \lambda$ kills every summand divisible by $t - \lambda$, leaving $f'(\lambda) = \prod_{\mu \neq \lambda} (\lambda - \mu)$). This means we want to define
$$e_{\lambda}(T) = \frac{f(T)}{(T - \lambda) f'(\lambda)}.$$
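For a concrete instance of this formula (continuing the involution example from above): with $f(t) = t^2 - 1$ we have $f'(t) = 2t$, so

$$e_1(T) = \frac{f(T)}{(T - 1) f'(1)} = \frac{T + I}{2}, \qquad e_{-1}(T) = \frac{f(T)}{(T + 1) f'(-1)} = \frac{I - T}{2},$$

and using $T^2 = I$ one checks directly that these are idempotent, sum to $I$, and satisfy $T e_{\pm 1}(T) = \pm e_{\pm 1}(T)$.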
Since we know $e_{\lambda}(T)$ produces vectors in $V_{\lambda}$, the above calculation implies $e_{\lambda}(T)^2 v = e_{\lambda}(T) v$ for every $v \in V$, so $e_{\lambda}(T)$ is idempotent, has image contained in $V_{\lambda}$, and fixes $V_{\lambda}$, meaning it is a projection onto $V_{\lambda}$ as desired; moreover its kernel contains $\bigoplus_{\mu \neq \lambda} V_{\mu}$. This is very promising. It remains to show that every vector $v \in V$ can be written as a sum of eigenvectors, and any such sum must take the form $v = \sum_{\lambda} e_{\lambda}(T) v$. This means we need to prove that
$$I = \sum_{\lambda} e_{\lambda}(T) = \sum_{\lambda} \frac{f(T)}{(T - \lambda) f'(\lambda)}.$$
This follows from a general polynomial identity. The polynomial $g(t) = \sum_{\lambda} \frac{f(t)}{(t - \lambda) f'(\lambda)}$ has degree at most $\deg f - 1$ and satisfies $g(\lambda) = 1$ for every root $\lambda$ of $f$ (every term is divisible by $t - \lambda$ except one, and the term that isn't evaluates to $\frac{\prod_{\mu \neq \lambda} (\lambda - \mu)}{f'(\lambda)} = 1$). So $g(t) - 1$ is a polynomial of degree at most $\deg f - 1$ with $\deg f$ distinct roots, hence is identically zero; that is, $g(t) = 1$, and so $g(T) = I$. $\Box$
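As a sanity check (again purely illustrative), this partition-of-unity identity can be verified symbolically, here for a generic separable cubic with formal roots $a, b, c$:

```python
# Illustrative SymPy check of the identity sum_lam f(t)/((t - lam) f'(lam)) = 1.
import sympy as sp

t, a, b, c = sp.symbols('t a b c')   # a, b, c stand for three distinct roots
f = (t - a) * (t - b) * (t - c)
fprime = sp.diff(f, t)

g = sum(sp.cancel(f / (t - lam)) / fprime.subs(t, lam) for lam in (a, b, c))

# g - 1 combines into a single fraction whose numerator expands to 0,
# i.e. g(t) = 1 identically.
num, den = sp.fraction(sp.together(g - 1))
assert sp.expand(num) == 0
```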
A small note on notation. $\frac{f(T)}{T - \lambda}$ may not look well-defined if you think of the quotient as happening in endomorphisms, since $T - \lambda$ need not be invertible (and never is when $\lambda$ is an eigenvalue). It is shorthand for the value at $T$ of the polynomial $\frac{f(t)}{t - \lambda} \in K[t]$; since $t - \lambda$ divides $f(t)$ by hypothesis, this is unproblematic.
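For instance, with $f(t) = t^2 - 1$ and $\lambda = 1$, we have $\frac{f(t)}{t - \lambda} = t + 1$, so $\frac{f(T)}{T - 1}$ means the operator $T + I$.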
These arguments may look complicated, but they are a strict subset of the arguments you have to make in one form or another to prove JNF, and ultimately the only technical ingredients are fundamental facts about polynomials. A generalization of this argument which makes no assumptions on the polynomial $f$ can be used to begin the proof of the existence of rational canonical form.