Let $\{X_n\}$ be a sequence of real random variables. We say $X_n$ converges to a real random variable $X$ in probability if for all $\epsilon > 0$ $$\lim_{n\rightarrow \infty} P(|X_n - X| > \epsilon) = 0,$$ and $X_n$ converges to $X$ a.e. (almost surely) if $$P(\lim_{n \rightarrow \infty} X_n = X) = 1.$$
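To make the gap between the two notions concrete, here is a minimal simulation sketch (the variable names and the specific example are my own, not from any source): take independent events with $P(X_n = 1) = 1/n$ and $X_n = 0$ otherwise, with candidate limit $X = 0$. Then $P(|X_n - X| > \epsilon) = 1/n \to 0$, so $X_n \to 0$ in probability; but since $\sum 1/n$ diverges, the second Borel–Cantelli lemma gives $X_n = 1$ infinitely often almost surely, so $X_n$ does not converge a.s.

```python
import random

random.seed(0)

def sample_path(n_max):
    # X_n = 1 with probability 1/n, independently; candidate limit X = 0.
    return [1 if random.random() < 1.0 / n else 0 for n in range(1, n_max + 1)]

paths = [sample_path(2000) for _ in range(2000)]

# Convergence in probability: P(|X_1000 - 0| > eps) = 1/1000, which is tiny.
p_at_1000 = sum(p[999] for p in paths) / len(paths)

# But the tail keeps misbehaving: on roughly half the paths, some X_n with
# n in [1000, 2000] still equals 1 (and over the full infinite tail this
# happens with probability 1, by the second Borel-Cantelli lemma).
tail_hit = sum(1 for p in paths if any(p[n] for n in range(999, 2000))) / len(paths)

print(p_at_1000)  # small, on the order of 1/1000
print(tail_hit)   # stays near 1/2 however large the window start N is
```

The point is exactly Tao's: each single $X_n$ is close to $0$ with high probability, yet the tail $(X_n)_{n \geq N}$ never settles down.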
From measure theory, I've always thought of convergence in measure as saying that the measure of the set $$\{x \mid |X_n(x) - X(x)| > \epsilon\}$$ tends to zero as $n \to \infty$, but the set itself might "jump around" (e.g. the well-known sliding block example as discussed in this question). In contrast, a.e. convergence requires this set to be fixed in some sense.
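The sliding block (typewriter) sequence can be sketched directly; this is a small illustration I've written myself, with my own helper names. Writing $n = 2^k + j$ with $0 \le j < 2^k$, let $f_n$ be the indicator of the dyadic block $[j/2^k, (j+1)/2^k)$, so the blocks sweep repeatedly across $[0,1)$ while shrinking in width:

```python
def block(n):
    # Write n = 2**k + j with 0 <= j < 2**k; block is [j/2**k, (j+1)/2**k).
    k = n.bit_length() - 1
    j = n - 2**k
    return (j / 2**k, (j + 1) / 2**k)

def f(n, x):
    # Indicator of the n-th sliding block.
    a, b = block(n)
    return 1 if a <= x < b else 0

# The measure of {x : f_n(x) > eps} is the block width 2**(-k), which
# tends to 0: f_n -> 0 in measure (Lebesgue measure on [0, 1)).
widths = [block(n)[1] - block(n)[0] for n in (4, 16, 64, 256)]
print(widths)  # blocks shrink: 1/4, 1/16, 1/64, 1/256

# But fix any x: in each generation k, exactly one block contains x,
# so f_n(x) = 1 infinitely often and f_n(x) does not converge anywhere.
x = 0.3
hits = [n for n in range(1, 256) if f(n, x) == 1]
print(hits)  # one hit per generation k = 0, ..., 7
```

This is the precise sense in which the bad set "jumps around": its measure shrinks, but it revisits every point infinitely often, so convergence in measure holds while a.e. convergence fails.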
On Terence Tao's blog, he says:
...roughly speaking, convergence in probability is good for controlling how a single random variable $X_n$ is close to its putative limiting value $X$, while almost sure convergence is good for controlling how the entire tail $(X_n)_{n \geq N}$ of a random sequence of random variables is close to its putative limit $X$.
I have never thought about convergence in measure/probability as describing a single element $X_n$ in the sequence $\{X_n\}$. What does Tao mean by this? Also, how does a.e. convergence control the entire tail of the sequence?