
Let $(p_i)_i$ be i.i.d. Bernoulli variables with $P(p_i = 1) = 1/d$ and $P(p_i = 0) = 1 - 1/d$, where $d \geq 2$ is a fixed integer. Then $E(p_i) = 1/d$ and therefore by the strong law of large numbers,

$$ \overline{p}_n = \frac{p_0 + \cdots + p_{n-1}}{n} \to \frac{1}{d} $$

almost surely as $n \to \infty$. In particular, this implies that for each fixed $\varepsilon \in (0, 1/d)$,

$$ P\Big(\overline{p}_n < \frac{1}{d} - \varepsilon\Big) \to 0 $$

as $n \to \infty$. Moreover, the convergence should be exponential, in the sense that there exists $\eta = \eta(\varepsilon) > 0$ such that

$$ P\Big(\overline{p}_n < \frac{1}{d} - \varepsilon\Big) = \mathcal{O}(e^{-\eta n}) $$

My question is: What is $\eta$ in terms of $\varepsilon$ and $d$? Can we compute the following?

$$ \eta(\varepsilon) = \lim_{n\to\infty} -\frac{1}{n}\log P\Big(\overline{p}_n < \frac{1}{d} - \varepsilon\Big) $$

In the case that $d = 2$, this has the interpretation of bounding sums of binomial coefficients, in the sense that

$$ \sum_{k=0}^{\lfloor \alpha n \rfloor} \binom{n}{k} \leq 2^{H(\alpha)n} $$

where $\alpha \in (0,1/2)$ and $H(\alpha) = -\alpha \log_2 \alpha - (1-\alpha) \log_2(1-\alpha)$ is the binary entropy function. Since $P(\overline{p}_n < \alpha) = 2^{-n}\sum_{k < \alpha n}\binom{n}{k}$, this would imply in this case that $\eta(\varepsilon) \geq \big(1 - H(1/2 - \varepsilon)\big)\log 2$.
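
(For a quick numerical sanity check of this $d = 2$ bound, here is a short Python sketch; the helpers `log2_binom_sum` and `H` are my own, not standard library functions.)

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def log2_binom_sum(n, m):
    """log2 of sum_{k=0}^{m} C(n, k), accumulated in log space to avoid overflow."""
    k = np.arange(m + 1)
    log_terms = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)  # ln C(n, k)
    return logsumexp(log_terms) / np.log(2)

def H(a):
    """Binary entropy function in bits."""
    return -a * np.log2(a) - (1 - a) * np.log2(1 - a)

alpha = 0.4  # i.e. 1/2 - epsilon with epsilon = 0.1
for n in [100, 1000, 10000]:
    lhs = log2_binom_sum(n, int(np.floor(alpha * n)))  # log2 of the partial sum
    rhs = n * H(alpha)                                  # log2 of the bound 2^{H(alpha) n}
    print(n, lhs, rhs)  # lhs stays below rhs, and lhs / n tends to H(alpha)
```

Dividing the partial sum by $2^n$ turns this into $-\frac{1}{n}\log_2 P(\overline{p}_n < \alpha) \geq 1 - H(\alpha)$, which is the inequality used above.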

I wondered if there existed a similar bound for summing multinomial coefficients, though I am not familiar enough with probability theory to know how to compute the limit above (or how to prove the bound in the $d = 2$ case).

Rob
    Have you tried using the Central Limit Theorem? $\bar p_n$ is approximately normal with mean $\frac{1}{d}$ and standard deviation $\sqrt{\frac{(1/d)(1-1/d)}{n}}$. We can then apply the bound that the tail area of the normal beyond $x$ is at most the normal density at $x$ divided by $x$. – Jayanth R Varma Mar 14 '25 at 05:27
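
Carrying the comment's suggestion through heuristically (this only indicates the expected order of the exponent for small $\varepsilon$; the CLT by itself does not prove a large-deviation bound), write $\sigma^2 = \frac{1}{d}\big(1-\frac{1}{d}\big)$. The normal approximation together with the Gaussian tail bound $\Pr(Z>x)\leq \phi(x)/x$ suggests

$$ P\Big(\overline{p}_n < \frac{1}{d}-\varepsilon\Big) \approx \Pr\Big(Z > \frac{\varepsilon\sqrt{n}}{\sigma}\Big) \leq \frac{\sigma}{\varepsilon\sqrt{n}}\,\phi\Big(\frac{\varepsilon\sqrt{n}}{\sigma}\Big) = \mathcal{O}\Big(e^{-n\varepsilon^2 d^2/(2(d-1))}\Big), $$

i.e. a heuristic exponent of $\frac{\varepsilon^2 d^2}{2(d-1)}$, which matches the second-order Taylor expansion in $\varepsilon$ of the exact rate given in the answer below.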

1 Answer


To stick to the usual formalism of large deviations, denote $Y_i=1-p_i$ and $p=1-\frac{1}{d}=1-q$, so that $\{\overline{p}_n<q-\epsilon\}=\{\overline{Y}_n>p+\epsilon\}$. For $s>0$ you consider

$$\Pr(Y_1+\cdots+Y_n>n(p+\epsilon))=\Pr\big(e^{s(Y_1+\cdots+Y_n)}\geq e^{ns(p+\epsilon)}\big)\leq e^{-ns(p+\epsilon)}E\left(e^{s(Y_1+\cdots+Y_n)}\right)=\left((pe^s+q)e^{-s(p+\epsilon)}\right)^n$$

by the Markov inequality $\Pr(U\geq u)\leq \frac{E(U)}{u}$ for a nonnegative random variable $U$ and $u>0$. Choosing the minimizing $s$, namely $e^{s}=\frac{q(p+\epsilon)}{p(q-\epsilon)}$, you get

$$\Pr(\overline{Y}_n>p+\epsilon)\leq e^{-nD(p+\epsilon\,\|\,p)},\qquad D(a\,\|\,p)=a\log\frac{a}{p}+(1-a)\log\frac{1-a}{1-p}.$$

Actually the large deviation theorem (Cramér/Chernoff) says that furthermore

$$\lim_{n\to\infty}-\frac{1}{n}\log\Pr(\overline{Y}_n>p+\epsilon)=D(p+\epsilon\,\|\,p),$$

so in the notation of the question $\eta(\varepsilon)=D\big(\tfrac{1}{d}-\varepsilon\,\big\|\,\tfrac{1}{d}\big)$. The proof is elementary but too long to be given here.
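
As a numerical illustration of this rate (just a sketch in Python; the helper `kl_bernoulli` and the parameter choices are mine, not anything from a standard API), one can compare the exact binomial lower tail with the predicted exponent $D\big(\tfrac{1}{d}-\varepsilon \,\|\, \tfrac{1}{d}\big)$:

```python
import numpy as np
from scipy.stats import binom

def kl_bernoulli(a, p):
    """Relative entropy D(a || p) between Bernoulli(a) and Bernoulli(p), natural log."""
    return a * np.log(a / p) + (1 - a) * np.log((1 - a) / (1 - p))

d, eps = 3, 0.05
q = 1.0 / d
rate = kl_bernoulli(q - eps, q)  # predicted eta(eps) = D(1/d - eps || 1/d)

for n in [1000, 10000, 100000]:
    k = int(np.floor((q - eps) * n))   # P(mean < q - eps) is (up to one term) P(S_n <= k)
    emp = -binom.logcdf(k, n, q) / n   # empirical -(1/n) log P, natural log
    print(n, emp, rate)                # emp approaches rate as n grows
```

The prefactor in front of $e^{-n\eta}$ is only polynomial in $n$, so the empirical exponent approaches the limiting rate fairly slowly, but the agreement is already visible at moderate $n$.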