Below are two supposed proofs for the Cayley-Hamilton theorem. The first is a bogus proof; I would really appreciate comments as to whether my explanation as to why it is a bogus proof is correct.
The second proof is more of a standard Cayley-Hamilton proof. Again, comments regarding any mistakes would be appreciated. Thanks.
Theorem: Let $A$ be a square matrix over a commutative ring. Then $A$ satisfies its own characteristic polynomial.
Bogus Proof
$p_A(\lambda)=\det(\lambda I-A)$ and substituting $A$ for $\lambda$, $p_A(A)=\det(AI-A)=\det(A-A)=\det(0)=0$.
Any proof that substitutes $A$ for $\lambda$ inside the matrix $\lambda I-A$ before taking the determinant is incorrect. $\lambda I-A$ is a polynomial matrix: its entries are polynomials in the variable $\lambda$, i.e. elements of the polynomial ring $R[\lambda]$, where $R$ is the underlying commutative ring and $\lambda$ is an indeterminate. We therefore cannot substitute the matrix $A$ for $\lambda$ entrywise. Note also the type mismatch: the bogus proof computes $\det(A-A)$, which is the scalar $0$, whereas $p_A(A)$ is a matrix, and the theorem asserts that this matrix is the zero matrix.
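As a sanity check (not part of the argument), here is a minimal SymPy sketch contrasting the two computations; the sample matrix is an arbitrary choice for illustration:

```python
import sympy as sp

lam = sp.symbols('lam')
A = sp.Matrix([[2, 1], [0, 3]])   # arbitrary sample matrix
n = A.shape[0]

# Bogus substitution: plug A in *before* taking the determinant.
# The result is the scalar 0, not a matrix.
print((A * sp.eye(n) - A).det())   # 0

# The theorem's claim: compute the scalar polynomial
# p_A(lam) = det(lam*I - A) first, then evaluate it at the matrix A.
p = (lam * sp.eye(n) - A).det()            # lam**2 - 5*lam + 6
coeffs = sp.Poly(p, lam).all_coeffs()      # [1, -5, 6], highest power first
p_of_A = sum((c * A**k for k, c in enumerate(reversed(coeffs))), sp.zeros(n))
print(p_of_A)                              # the 2x2 zero matrix
```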
Proof:
Take the identity $\det(\lambda I_n-A)\,I_n=(\lambda I_n-A)\operatorname{adj}(\lambda I_n-A)$.
By definition $p_A(\lambda):=\det(\lambda I_n-A)$, therefore $p_A(\lambda)I_n=(\lambda I_n-A)\operatorname{adj}(\lambda I_n-A)\qquad(*)$
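For what it's worth, the identity $(*)$ can be checked symbolically with SymPy's `adjugate`; the sample matrix is again arbitrary:

```python
import sympy as sp

lam = sp.symbols('lam')
A = sp.Matrix([[2, 1], [0, 3]])   # arbitrary sample matrix
n = A.shape[0]

M = lam * sp.eye(n) - A
lhs = M.det() * sp.eye(n)         # det(lam*I_n - A) * I_n
rhs = M * M.adjugate()            # (lam*I_n - A) * adj(lam*I_n - A)
print((lhs - rhs).expand())       # the zero matrix, so (*) holds
```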
The LHS of $(*)$ can be written as a linear combination of constant matrices. By definition, $p_A(\lambda)=\lambda^n+c_{n-1}\lambda^{n-1}+\cdots+c_1\lambda+c_0$, therefore $p_A(\lambda)I_n=\lambda^nI_n+c_{n-1}\lambda^{n-1}I_n+\cdots+c_1\lambda I_n+c_0I_n$
The RHS of $(*)$ can also be written as a linear combination of constant matrices. $\operatorname{adj}(\lambda I_n-A)$ is a polynomial matrix and can therefore be expressed as a linear combination of constant matrices. As the entries of $\operatorname{adj}(\lambda I_n-A)$ are, up to sign, the $(n-1)\times(n-1)$ minors of the matrix $\lambda I_n-A$, the entries of $\operatorname{adj}(\lambda I_n-A)$ are polynomials of degree $n-1$ or less. Therefore:
$\operatorname{adj}(\lambda I_n -A)=\lambda^{n-1} B_{n-1}+\lambda^{n-2} B_{n-2}+\cdots+\lambda^1 B_1+\lambda^0 B_0=\displaystyle \sum_{i=0}^{n-1} \lambda^i B_i$
Using this to expand the RHS of $(*)$:
$(\lambda I_n -A)\displaystyle \sum_{i=0}^{n-1} \lambda^i B_i=\displaystyle \sum_{i=0}^{n-1} \lambda I_n \lambda^i B_i-\displaystyle \sum_{i=0}^{n-1} A\lambda^i B_i=\displaystyle \sum_{i=0}^{n-1} \lambda^{i+1} B_i-\displaystyle \sum_{i=0}^{n-1} \lambda^i AB_i=\displaystyle \sum_{i=0}^{n-1} (\lambda^{i+1} B_i-\lambda^i AB_i)$
$=\lambda^n B_{n-1}+\displaystyle \sum_{i=1}^{n-1} \lambda^i (B_{i-1}-AB_i)-AB_0$
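A quick SymPy sketch of this step: extract the coefficient matrices $B_i$ entrywise and confirm the decomposition of $\operatorname{adj}(\lambda I_n-A)$ (same arbitrary sample matrix as above):

```python
import sympy as sp

lam = sp.symbols('lam')
A = sp.Matrix([[2, 1], [0, 3]])   # arbitrary sample matrix
n = A.shape[0]

adj = (lam * sp.eye(n) - A).adjugate()

# Each entry of adj has degree <= n-1 in lam, so B_i is the matrix of
# lam**i coefficients, extracted entry by entry.
B = [adj.applyfunc(lambda e: e.coeff(lam, i)) for i in range(n)]

# Rebuild adj from the B_i and confirm the decomposition matches.
rebuilt = sum((lam**i * B[i] for i in range(n)), sp.zeros(n))
print((adj - rebuilt).expand())   # the zero matrix
```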
Both sides of $(*)$ are now polynomials in $\lambda$ with constant matrix coefficients. When two such polynomials are equal, their coefficients are equal; equating the coefficients of $\lambda^i$:
$$\begin{aligned}\lambda^n:&& I_n&=B_{n-1}\\ \lambda^{n-1}:&& c_{n-1}I_n&=B_{n-2}-AB_{n-1}\\ &&&\;\;\vdots\\ \lambda^1:&& c_1I_n&=B_0-AB_1\\ \lambda^0:&& c_0I_n&=-AB_0\end{aligned}$$
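These coefficient equations can likewise be verified for the sample matrix used above:

```python
import sympy as sp

lam = sp.symbols('lam')
A = sp.Matrix([[2, 1], [0, 3]])   # arbitrary sample matrix
n = A.shape[0]

adj = (lam * sp.eye(n) - A).adjugate()
B = [adj.applyfunc(lambda e: e.coeff(lam, i)) for i in range(n)]
# c[i] is the coefficient of lam**i in p_A(lam); note c[n] = 1.
c = sp.Poly((lam * sp.eye(n) - A).det(), lam).all_coeffs()[::-1]

print(B[n - 1] == sp.eye(n))      # lam^n:  I_n = B_{n-1}  ->  True
for i in range(1, n):             # lam^i:  c_i I_n = B_{i-1} - A B_i
    print((c[i] * sp.eye(n) - (B[i - 1] - A * B[i])).expand())  # zero matrix
# lam^0:  c_0 I_n = -A B_0
print((c[0] * sp.eye(n) + A * B[0]).expand())                   # zero matrix
```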
Now multiply the equation obtained from the coefficient of $\lambda^i$ on the left by $A^i$ (the $\lambda^n$ equation by $A^n$, the $\lambda^{n-1}$ equation by $A^{n-1}$, and so on, down to the $\lambda^0$ equation by $A^0=I_n$), and sum the resulting equations:
$A^n+c_{n-1}A^{n-1}+\ldots+c_1A+c_0I_n=A^nB_{n-1}+A^{n-1}B_{n-2}-A^nB_{n-1}+\ldots+AB_0-A^2B_1-AB_0\qquad(**)$
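Both sides of $(**)$ can be checked with the same SymPy setup; the LHS assembles $p_A(A)$ from the coefficients $c_i$, and the RHS is the telescoping sum:

```python
import sympy as sp

lam = sp.symbols('lam')
A = sp.Matrix([[2, 1], [0, 3]])   # arbitrary sample matrix
n = A.shape[0]

adj = (lam * sp.eye(n) - A).adjugate()
B = [adj.applyfunc(lambda e: e.coeff(lam, i)) for i in range(n)]
c = sp.Poly((lam * sp.eye(n) - A).det(), lam).all_coeffs()[::-1]

# LHS of (**): p_A(A) = A^n + c_{n-1} A^{n-1} + ... + c_1 A + c_0 I_n.
lhs = sum((c[i] * A**i for i in range(n + 1)), sp.zeros(n))

# RHS of (**): A^n B_{n-1} + sum_i A^i (B_{i-1} - A B_i) - A B_0.
rhs = A**n * B[n - 1] - A * B[0]
for i in range(1, n):
    rhs += A**i * (B[i - 1] - A * B[i])

print(lhs, rhs)   # both are the zero matrix
```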
The LHS of $(**)$ is $p_A(A)$, the characteristic polynomial of $A$ evaluated at $A$, and the RHS of $(**)$ is a telescoping sum that collapses to the zero matrix. Therefore $p_A(A)=0$; that is, $A$ satisfies its own characteristic polynomial.
Q.E.D.