
As a practice problem, I am trying to prove the relationship between the Normal Distribution and the Binomial Distribution. I have seen several proofs of this before (e.g. Justifying the Normal Approx to the Binomial Distribution through MGFs), but I wanted to try and do this myself. Specifically, I wanted to try and do this primarily using Moment Generating Functions.

Here are my steps and where I got stuck:

  1. Let $X$ be a binomially distributed random variable with parameters $n$ and $p$.

  2. For a binomial distribution: $$E(X) = np$$ $$Var(X) = np(1-p)$$

  3. Define a standardized version of $X$: $$Z = \frac{X - np}{\sqrt{np(1-p)}}$$ Goal: Show that as $n$ approaches infinity, the distribution of $Z$ approaches $N(0,1)$.

  4. I wanted to try and do this using Moment Generating Functions (MGF). The MGF of $Z$ is: $$M_Z(t) = E[e^{tZ}]$$

  5. Substituting the definition of $Z$: $$M_Z(t) = E\left[\exp\left(\frac{t(X - np)}{\sqrt{np(1-p)}}\right)\right]$$

  6. The MGF of the binomial distribution $X$ is (Finding the Moment Generating function of a Binomial Distribution): $$M_X(t) = (pe^t + 1-p)^n$$

  7. Now we do the following manipulations: $$M_Z(t) = E\left[\exp\left(\frac{t(X - np)}{\sqrt{np(1-p)}}\right)\right]$$

    $$M_Z(t) = E\left[\exp\left(\frac{tX}{\sqrt{np(1-p)}} - \frac{tnp}{\sqrt{np(1-p)}}\right)\right]$$

Since $e^{a+b} = e^a \cdot e^b$, we can write: $$M_Z(t) = E\left[\exp\left(\frac{tX}{\sqrt{np(1-p)}}\right) \cdot \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right)\right]$$

The second term doesn't involve X, so we can take it out of the expectation: $$M_Z(t) = E\left[\exp\left(\frac{tX}{\sqrt{np(1-p)}}\right)\right] \cdot \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right)$$

First term: $$E\left[\exp\left(\frac{tX}{\sqrt{np(1-p)}}\right)\right] = E\left[\exp\left(X \cdot \frac{t}{\sqrt{np(1-p)}}\right)\right]$$

This is the definition of the moment-generating function of $X$, evaluated at $\frac{t}{\sqrt{np(1-p)}}$: $$E\left[\exp\left(X \cdot \frac{t}{\sqrt{np(1-p)}}\right)\right] = M_X\left(\frac{t}{\sqrt{np(1-p)}}\right)$$

Putting it all together: $$M_Z(t) = M_X\left(\frac{t}{\sqrt{np(1-p)}}\right) \cdot \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right)$$
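The identity above can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: it computes $E[e^{tZ}]$ by summing directly over the binomial PMF and compares it with $M_X\left(t/\sqrt{np(1-p)}\right)\exp\left(-tnp/\sqrt{np(1-p)}\right)$. The function names and the test values of $n$, $p$, $t$ are illustrative assumptions.

```python
import math


def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def mgf_z_direct(n, p, t):
    """E[exp(t Z)] by direct summation over the binomial PMF."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return sum(
        math.exp(t * (k - mu) / sigma) * binom_pmf(n, p, k)
        for k in range(n + 1)
    )


def mgf_x(n, p, s):
    """Binomial MGF: (p e^s + 1 - p)^n."""
    return (p * math.exp(s) + 1 - p) ** n


def mgf_z_via_mgf_x(n, p, t):
    """M_X(t / sigma) * exp(-t n p / sigma), the identity derived above."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return mgf_x(n, p, t / sigma) * math.exp(-t * mu / sigma)


# Illustrative values (assumptions, chosen only for the check).
n, p, t = 50, 0.3, 0.7
direct = mgf_z_direct(n, p, t)
via_identity = mgf_z_via_mgf_x(n, p, t)
print(direct, via_identity)
```

Both numbers should agree up to floating-point rounding, which is what the algebra predicts for any finite $n$.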

  8. Substituting the binomial MGF: $$M_Z(t) = \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right) \cdot \left(p\exp\left(\frac{t}{\sqrt{np(1-p)}}\right) + (1-p)\right)^n$$

  9. Use the second-order Taylor expansion $e^x = 1 + x + \frac{x^2}{2} + o(x^2)$ with $x = \frac{t}{\sqrt{np(1-p)}}$: $$M_Z(t) \approx \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right) \cdot \left(p\left(1 + \frac{t}{\sqrt{np(1-p)}} + \frac{t^2}{2np(1-p)}\right) + (1-p)\right)^n$$

  10. Simplify: $$M_Z(t) \approx \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right) \cdot \left(1 + \frac{pt}{\sqrt{np(1-p)}} + \frac{pt^2}{2np(1-p)}\right)^n$$

  11. I start to stumble at these steps.

Intuitively, I am guessing that since this factor is raised to the power $n$, the exponent needs to be removed somehow. I have read that the Binomial Expansion can be used, keeping terms up to $t^2$:

$$ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k $$ $$\left(1 + t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}\right)^n $$ where $a = 1$ and $b = t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}$

$$ (1 + b)^n = 1 + nb + \frac{n(n-1)}{2}b^2 + ... $$ $$ (1 + b)^n = 1 + n\left(t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}\right) + \frac{n(n-1)}{2}\left(t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}\right)^2 + ... $$

$$M_Z(t) \approx \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right) \cdot \left(1 + t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}\right)^n$$

  12. Expand the expression: $$M_Z(t) \approx \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right) \cdot \left(1 + n\left(t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}\right) + \frac{n(n-1)}{2}\left(t\sqrt{\frac{p}{n(1-p)}} + \frac{t^2}{2n}\right)^2 + ...\right)$$

  13. Simplify, keeping terms up to $t^2$: $$M_Z(t) \approx \exp\left(-\frac{tnp}{\sqrt{np(1-p)}}\right) \cdot \left(1 + t\sqrt{\frac{np}{1-p}} + \frac{t^2}{2} + \frac{t^2p}{2(1-p)} + ...\right)$$

  14. This is where I really got stuck. Intuitively, I know that I need to take the limit as $n$ approaches infinity and show that this MGF converges to the MGF of a Standard Normal: $$\lim_{n\to\infty} M_Z(t) = \; ??? $$

I think I have made mistakes, because the first factor looks like the exponential of negative infinity, which tends to 0.
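One way to see why truncating the binomial expansion is delicate here is to look at the size of its successive terms for a fixed $t$. The sketch below is an illustration under stated assumptions, not part of the derivation above: it takes $b$ from the bracket in the "Simplify" step, $b = \frac{pt}{\sqrt{np(1-p)}} + \frac{pt^2}{2np(1-p)}$, and prints the first few terms $\binom{n}{k} b^k$ of $(1+b)^n$. The helper name `expansion_terms` and the values of $p$ and $t$ are hypothetical choices.

```python
import math


def expansion_terms(n, p, t, kmax=4):
    """First few terms C(n, k) * b**k of the binomial expansion of (1 + b)**n."""
    b = p * t / math.sqrt(n * p * (1 - p)) + p * t**2 / (2 * n * p * (1 - p))
    return [math.comb(n, k) * b**k for k in range(kmax + 1)]


# Illustrative values (assumptions): fixed t, growing n.
for n in (100, 10_000):
    terms = expansion_terms(n, p=0.3, t=1.0)
    print(n, [f"{x:.3g}" for x in terms])
```

Because $b$ only shrinks like $n^{-1/2}$ while $\binom{n}{k}$ grows like $n^k$, the early terms get larger with $k$ and with $n$, so stopping after the quadratic term does not give a controlled approximation when $t$ is held fixed.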

Can someone please help me out here?

konofoso
  • The central limit theorem is probably an easier approach, and one that generalizes better. – Peter Jul 13 '24 at 05:10
  • @Peter: thanks for the tip! I thought about using the CLT, but I really wanted to try and do this using MGFs ... I am wondering if I have reached a dead end? – konofoso Jul 13 '24 at 05:11
  • Did you miss a $(1-p)$ factor in the denominator of the 2nd term of your $b$? – Zack Fisher Jul 13 '24 at 06:17
  • @Zack Fisher: I am looking at the handwritten notes that I transcribed ... it looks like I did ... I am about to log off and go to sleep ... will correct in the morning ... do you think that adding this $b$ term will make the limit solvable? – konofoso Jul 13 '24 at 06:20
  • Probably not. I'd keep $t$ fixed and let $n$ grow, rather than make $t$ small, as in your steps 12 and 13. – Zack Fisher Jul 13 '24 at 06:58
  • Decided to stay up and keep working ... so you think in the end it's better to abandon the MGF approach? – konofoso Jul 13 '24 at 07:44
  • No. On the contrary, I think MGF/CGF convergence is a canonical way to understand the CLT, and should be the preferred general approach. – Zack Fisher Jul 13 '24 at 15:48

2 Answers


For the second factor in step 8, let $v=1/n$ and take the logarithm to get $$ \frac{1}{v} \log\left\lbrace 1 + u(v)\right\rbrace, $$ where $$ u(v)=p\left( \exp\left\lbrace \frac{\sqrt{v}t}{\sqrt{p(1-p)}} \right\rbrace-1\right). $$

Expanding the exponential, $$ u(v)=p\left( \frac{\sqrt{v}t}{\sqrt{p(1-p)}} + \frac{vt^2}{{2p(1-p)}} + o(v) \right), $$ we see that $u(v)=O\left(v^{1/2}\right)$. So to counteract the $1/v$ factor, the logarithm needs to be expanded to 2nd order, i.e., $$ \log\left\lbrace 1+u(v) \right\rbrace = u(v) - \frac{u^2(v)}{2} + o\left[u^2(v)\right]. $$

Plug the expansion of $u(v)$ into this. The first-order term contributes $\sqrt{\frac{p}{1-p}}\sqrt{v}\,t + \frac{vt^2}{2(1-p)}$ and the second-order term contributes $-\frac{p\,vt^2}{2(1-p)}$, so $$ \frac{1}{v}\log\left\lbrace 1+u(v) \right\rbrace = \frac{1}{v}\left[ \sqrt{\frac{p}{1-p}}\sqrt{v}\,t + \frac{vt^2}{2(1-p)} - \frac{p\,vt^2}{2(1-p)} + o(v) \right] = \sqrt{\frac{p}{1-p}}\frac{t}{\sqrt{v}} + \frac{t^2}{2} + o(1) = \sqrt{\frac{np}{1-p}}t+ \frac{t^2}{2} + o(1). $$

Put this back into the $M_Z(t)$ result in step 8: $$ M_Z(t)=\exp\left\lbrace -\sqrt{\frac{np}{1-p}}t \right\rbrace \exp\left\lbrace \sqrt{\frac{np}{1-p}}t+ \frac{t^2}{2} + o(1) \right\rbrace = \exp\left\lbrace \frac{t^2}{2}+o(1) \right\rbrace, $$ which is the standard normal MGF in the limit.
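A quick numerical check of this limit, as a sketch under stated assumptions rather than a proof: working in log space, the first factor of step 8 has logarithm $-tnp/\sqrt{np(1-p)}$ and the second has logarithm $n\log\left(1 + p\left(e^{t/\sqrt{np(1-p)}} - 1\right)\right)$. Each diverges on its own, but their sum should approach $t^2/2$. The function name `log_mgf_z_parts` and the values of $t$ and $p$ are illustrative.

```python
import math


def log_mgf_z_parts(n, p, t):
    """Logs of the two factors of M_Z(t) from step 8, computed stably."""
    sigma = math.sqrt(n * p * (1 - p))
    log_first = -t * n * p / sigma
    # log(p * e^s + 1 - p) = log1p(p * expm1(s)), accurate for small s
    log_second = n * math.log1p(p * math.expm1(t / sigma))
    return log_first, log_second


# Illustrative values (assumptions): fixed t and p, growing n.
t, p = 1.5, 0.3
for n in (10, 1_000, 100_000, 10_000_000):
    log_first, log_second = log_mgf_z_parts(n, p, t)
    print(n, log_first, log_second, log_first + log_second)
print("t^2 / 2 =", t**2 / 2)
```

The first column of logs goes to $-\infty$ and the second to $+\infty$ at the same rate, which is why looking at the first factor in isolation suggests a limit of 0 even though the product converges.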

Zack Fisher

Consider the Binomial PMF: $$ P(X=k)=\binom{n}{k} p^k(1-p)^{n-k} $$

We will use Stirling's Approximation: $$ n!\approx \sqrt{2 \pi n}\left(\frac{n}{e}\right)^n $$

...for the binomial coefficient:

$$ \binom{n}{k} \approx \sqrt{\frac{n}{2 \pi k(n-k)}} \frac{n^n}{k^k(n-k)^{n-k}} $$

Now, substituting this back into the binomial PMF:

$$ P(X=k) \approx \sqrt{\frac{n}{2 \pi k(n-k)}} \frac{n^n}{k^k(n-k)^{n-k}}\, p^k (1-p)^{n-k} $$

Take the natural logarithm to turn the products into sums: $$ \ln P(X=k) \approx \frac{1}{2} \ln \frac{n}{2 \pi k(n-k)}+n \ln n-k \ln k-(n-k) \ln (n-k)+k \ln p+(n-k) \ln (1-p) $$

Let's look at $k$ near the expected value $n p$. Let $k=n p+x \sqrt{n p(1-p)}$, where $x$ is the deviation from the mean in units of the standard deviation, so that $x \sqrt{n p(1-p)}$ is small compared with $np$ and $n(1-p)$ when $n$ is large. In the square-root prefactor it is enough to substitute $k \approx n p$.

Expand Around the Mean: Use the Taylor expansion $\ln (1+y) \approx y-\frac{y^2}{2}$ for small $y$: $$ \ln (n p+x \sqrt{n p(1-p)}) \approx \ln (n p)+\frac{x \sqrt{n p(1-p)}}{n p}-\frac{(x \sqrt{n p(1-p)})^2}{2(n p)^2} $$

Similarly, $$ \ln (n-n p-x \sqrt{n p(1-p)}) \approx \ln (n-n p)-\frac{x \sqrt{n p(1-p)}}{n-n p}-\frac{(x \sqrt{n p(1-p)})^2}{2(n-n p)^2} $$

Substituting the Expansions: Substitute these expansions back into the logarithm expression, and keep only the terms that do not vanish as $n$ grows: $$ \ln P(X=k) \approx-\frac{1}{2} \ln (2 \pi n p(1-p))-\frac{x^2}{2} $$

Exponentiate to Get Back to PMF: Revert back by exponentiating the logarithm: $$ P(X=k) \approx \frac{1}{\sqrt{2 \pi n p(1-p)}} \exp \left(-\frac{x^2}{2}\right) $$

Writing $x$ back in terms of $k$, this is the PDF of a normal distribution with mean $np$ and variance $np(1-p)$: $$ P(X=k) \approx \frac{1}{\sqrt{2 \pi n p(1-p)}} \exp \left(-\frac{(k-n p)^2}{2 n p(1-p)}\right) $$

Yikes.
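The final approximation can be compared with the exact PMF numerically. This is a minimal sketch with illustrative values of $n$, $p$ and $k$ (assumptions, not from the answer); the exact PMF is evaluated through `math.lgamma` to avoid overflow for large $n$.

```python
import math


def binom_pmf(n, p, k):
    """Exact binomial PMF, evaluated in log space for numerical stability."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)


def normal_approx(n, p, k):
    """Normal density with mean n p and variance n p (1 - p), evaluated at k."""
    var = n * p * (1 - p)
    return math.exp(-((k - n * p) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Illustrative values (assumptions): k within a few standard deviations of n p.
n, p = 1000, 0.3
for k in (270, 290, 300, 310, 330):
    print(k, binom_pmf(n, p, k), normal_approx(n, p, k))
```

The agreement is closest near $k = np$ and degrades in the tails, consistent with the expansion being made around the mean.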

vallev
  • @vallev: thank you for this answer! I also saw an approach involving Stirling's Approximation ... but I was trying to do mine purely using MGFs ... is my way possible? Thank you so much ... – konofoso Jul 13 '24 at 05:26