First things first: this result is false for $p=2$. Indeed, $$(\mathbb Z_2, +)\simeq (1+4\mathbb Z_2,\times)\not\simeq (1+2\mathbb Z_2, \times)\simeq (1+4\mathbb Z_2)\times\{\pm 1\}$$
Details can be found in e.g. Serre's A Course in Arithmetic.
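As a quick sanity check of the failure at $p=2$, here is a minimal Python sketch working modulo $2^N$. The choice of $N=6$ and of $5=1+4$ as a generator of $1+4\mathbb Z_2$ are my own assumptions for illustration, not something taken from the question.

```python
# Finite-level check modulo 2^N (N = 6 is an arbitrary illustrative choice).
N = 6
mod = 2 ** N

# Powers of 3 = 1 + 2: they miss -1, so x -> (1+2)^x is not onto 1 + 2Z_2.
powers_of_3 = {pow(3, x, mod) for x in range(mod)}
# Powers of 5 = 1 + 4: they cover every residue that is 1 mod 4.
powers_of_5 = {pow(5, x, mod) for x in range(mod)}

minus_one_is_power_of_3 = (mod - 1) in powers_of_3
five_generates_1_mod_4 = powers_of_5 == set(range(1, mod, 4))
```

Here `minus_one_is_power_of_3` comes out false, reflecting the extra factor $\{\pm 1\}$, while `five_generates_1_mod_4` comes out true.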
It is very common in commutative algebra to write $a\equiv b\mod I$ when $I$ is an ideal of a commutative ring and $a-b\in I$. It is less common but not unheard of to use the notation in group theory, such as in this post. In van der Waerden's classic, he introduced the same notation for abelian groups at the end of Section 2.5 (also in Hungerford as OP mentioned in the comment).
I do remember seeing this notation in the context of $p$-adic multiplicative structures, but couldn't find an example just now. In your context, it makes sense both ways. The strategy is clear: first show that $x\mapsto(1+p)^x$ is a continuous group homomorphism with trivial kernel. The hard part is to show it is also surjective. To do this, given any $1+pa$, suppose we can inductively build $x_n$ such that
$$(1+p)^{x_n}(1+pa)^{-1}\in 1+p^n\mathbb Z_p$$
Then, since $\mathbb Z_p$ is compact, we can find a convergent subsequence $x_{n_m}\rightarrow x$, and by continuity $$(1+p)^x(1+pa)^{-1}=\lim_{m\rightarrow \infty} (1+p)^{x_{n_m}}(1+pa)^{-1} = 1$$ $$\Longrightarrow (1+p)^x = 1+pa$$
But the same method essentially works if we can find $x_n$ such that $(1+p)^{x_n}-(1+pa)\in p^n\mathbb Z_p$.
Now we prove the result for $p\ge 3$. First, we prove a lemma (note that it is false when $p=2, n=1, x=2, m=2$):
$$\forall x\in \mathbb Z_p, n\ge 0, p^n\mid x \Rightarrow \forall m \ge 2, p^{n+2}\mid {x \choose m}p^m$$ $$\Downarrow$$ $$(1+p)^x\equiv 1 + px \mod p^{\nu_p(x) + 2}$$
The binomial coefficient ${x\choose m}$ is a polynomial in $x$, hence continuous, and it sends $\mathbb Z$ to $\mathbb Z$, so it must send the closure of $\mathbb Z$ to itself, i.e. ${x\choose m}\in\mathbb Z_p$ whenever $x\in\mathbb Z_p$. Therefore, when $m\ge n+2$, we are done. When $m\le n+1\le p^n$, for each $1\le k <m$ we have $k<p^n$, so $\nu_p(k)< n \le \nu_p(x)$, hence $\nu_p(x-k)\ge\min(\nu_p(x), \nu_p(k))= \nu_p(k)$, in other words $\frac{x-k}{k}\in\mathbb Z_p$. Writing ${x\choose m}=\frac{x}{m}\prod_{k=1}^{m-1}\frac{x-k}{k}$, we get
$$\nu_p({x \choose m} p^m)\ge \nu_p(x) + m -\nu_p(m)\ge n + m -\nu_p(m)$$
It remains to show $m-\nu_p(m)\ge 2$. When $\nu_p(m)=0$, this is exactly the assumption $m\ge 2$. If $\nu_p(m) = k\ge 1$, write $m=p^km'$, so that $$m-\nu_p(m)=p^k m' - k\ge p^k - k$$ and $p^k -k \ge 2$ can be easily shown by induction on $k$ (note that the base case $k=1$ depends on the assumption $p\ge 3$).
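The congruence in the lemma can be spot-checked on positive integers $x$ (which lie in $\mathbb Z_p$). The following is only a numerical sketch, not a proof; the helper names `nu` and `lemma_holds` and the tested ranges are my own choices.

```python
def nu(p, x):
    """p-adic valuation of a nonzero integer x."""
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v


def lemma_holds(p, x):
    """Check (1+p)^x == 1 + p*x modulo p^(nu_p(x) + 2) for a positive integer x."""
    modulus = p ** (nu(p, x) + 2)
    return pow(1 + p, x, modulus) == (1 + p * x) % modulus


# Odd primes: the congruence holds for every tested exponent.
odd_primes_ok = all(lemma_holds(p, x) for p in (3, 5, 7) for x in range(1, 500))

# p = 2, x = 2: 3^2 = 9 is 1 mod 8, but 1 + 2*2 = 5, so the congruence fails.
fails_at_two = not lemma_holds(2, 2)
```

The second check is the counterexample $p=2$, $x=2$ mentioned before the lemma.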
We skip proving that $x\mapsto (1+p)^x$ is a homomorphism. The lemma is useful in proving both injectivity and surjectivity. For injectivity, suppose $(1+p)^x = 1$ with $x\not=0$, i.e. $\nu_p(x)<\infty$. By our lemma $(1+p)^x\equiv 1+ px \mod p^{\nu_p(x)+2}$, and since $\nu_p(px)=\nu_p(x)+1$ we have $px\not\equiv 0 \mod p^{\nu_p(x)+2}$, so $(1+p)^x\not\equiv 1\mod p^{\nu_p(x)+2}$, a contradiction.
For surjectivity, note that $1+p^n\mathbb Z_p$ is a multiplicative subgroup of $\mathbb Z_p^{\times}$, so $(1+pa)^{-1} = 1+pb$ for some $b\in\mathbb Z_p$. When $n=1$, taking $x_1=1$ gives $(1+p)(1+pa)^{-1} = (1+p)(1+pb)\in 1+p\mathbb Z_p$, so the base case holds, and it remains to carry out the induction step.
That is, suppose we already have $(1+p)^{x_n}(1+pb)=1+p^n y$ for some $y\in \mathbb Z_p$. To build $x_{n+1}$, we write $x_{n+1} = x_n + z$ for some $z\in\mathbb Z_p$ to be constructed, so
$$(1+p)^{x_{n+1}}(1+pb) = (1+p)^{x_n + z}(1+pb) = (1+p)^{x_n}(1+pb)(1+p)^z = (1+p^ny)(1+p)^z$$
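If we can find $z$ such that $(1+p)^z \equiv 1-p^n y \mod p^{n+1}$, we are done, since then $(1+p^ny)(1+p)^z\equiv 1-p^{2n}y^2\equiv 1 \mod p^{n+1}$ (recall $n\ge 1$). By our lemma, $z=-p^{n-1}y$ works: $\nu_p(z)\ge n-1$, so $(1+p)^z\equiv 1+pz = 1-p^ny \mod p^{n+1}$.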
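The induction step is effectively an algorithm, and it can be run modulo $p^N$. Below is a minimal Python sketch under the assumptions that $p$ is an odd prime and $a$ is an ordinary integer; the function name `dlog_one_plus_p` is hypothetical, and reducing the exponent modulo $p^{N-1}$ uses the fact that $1+p$ has order $p^{N-1}$ in $(\mathbb Z/p^N)^\times$.

```python
def dlog_one_plus_p(a, p, N):
    """Return x with (1+p)^x == 1 + p*a (mod p^N), for an odd prime p.

    Follows the induction step: x_{n+1} = x_n - p^(n-1) * y, where
    (1+p)^(x_n) * (1+p*a)^(-1) = 1 + p^n * y.
    """
    mod = p ** N
    target_inv = pow(1 + p * a, -1, mod)  # modular inverse, needs Python 3.8+
    x = 1  # base case x_1 = 1
    for n in range(1, N):
        r = pow(1 + p, x, mod) * target_inv % mod
        assert (r - 1) % p ** n == 0  # inductive hypothesis: r is in 1 + p^n Z_p
        y = (r - 1) // p ** n
        x = (x - p ** (n - 1) * y) % p ** (N - 1)  # z = -p^(n-1) * y
    return x


# Example: solve 4^x == 7 (mod 27), i.e. p = 3, a = 2, N = 3.
x = dlog_one_plus_p(2, 3, 3)
check = pow(4, x, 27) == 7
```

Each pass through the loop gains one more $p$-adic digit of agreement, which is the finite-level shadow of the limit argument above.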
To work out the proof by showing $(1+p)^{x_n}\equiv 1+pa \mod p^n$, similar to above, we need to find $z$ such that
$$(1+p)^{x_n+z}-(1+pa) = (1+pa+p^ny)(1+p)^z - (1+pa)\in p^{n+1}\mathbb Z_p$$
This is possible ($z=\frac{-p^{n-1}y}{1+pa}$) but definitely harder, so I don't think it's a typo.
Again in Serre's A Course in Arithmetic, the result is established by analyzing $U_n/U_{n+1}$ where $U_n:=1+p^n\mathbb Z_p$, so it's more natural to exploit the multiplicative structure of $1+p\mathbb Z_p$ rather than the additive one.