
Suppose there are two independent sequences of Bernoulli random variables $\{X_i\}_{i=1}^{n}$ and $\{Y_i\}_{i=1}^{n}$ with $P(X_i=1)=p_1$ and $P(Y_i=1)=p_2$. Let $\hat{p}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\hat{p}_2 = \frac{1}{n}\sum_{i=1}^{n} Y_i$. Define $$T= \frac{\hat{p}_2-\hat{p}_1}{\sqrt{\frac{2 \hat{p} \hat{q}}{n}}},$$ where $\hat{p}=\frac{\hat{p}_1+\hat{p}_2}{2}$ and $\hat{q}=1-\hat{p}$, and let $$\Pi_n = P\left(T < -z_{\frac{\alpha}{2}}\right).$$

Here $z_{\alpha}$ denotes the upper $\alpha$ point of $N(0,1)$.

For the special case $p_1=p_2$, show that $\Pi_n$ converges to $\frac{\alpha}{2}$ as $n \rightarrow \infty$.
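As a sanity check (not part of any proof), the claim can be probed by Monte Carlo: simulate many pairs of samples with $p_1=p_2$, form $T$, and estimate $\Pi_n$. All parameter values below ($p$, $\alpha$, $n$, the number of replications) are illustrative choices, not from the problem.

```python
import numpy as np

rng = np.random.default_rng(0)
p, alpha, n, reps = 0.3, 0.05, 2000, 200_000
z = 1.959964  # upper alpha/2 = 0.025 point of N(0,1)

# Each replication: n Bernoulli(p) trials per sample, summarized by its count
p1_hat = rng.binomial(n, p, size=reps) / n
p2_hat = rng.binomial(n, p, size=reps) / n
p_bar = (p1_hat + p2_hat) / 2          # pooled estimate p-hat
T = (p2_hat - p1_hat) / np.sqrt(2 * p_bar * (1 - p_bar) / n)

pi_n = np.mean(T < -z)                  # Monte Carlo estimate of Pi_n
print(pi_n)                             # should be close to alpha/2 = 0.025
```

With $n$ this large the estimated $\Pi_n$ lands near $0.025$, consistent with the limit $\frac{\alpha}{2}$.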

By the CLT, I found that $$\frac{\hat{p}_2-\hat{p}_1}{\sqrt{\frac{2 p_1 (1-p_1)}{n}}} \xrightarrow{d} N(0,1)$$ under $p_1=p_2$. But the statistic estimates the denominator by $\frac{2 \hat{p} \hat{q}}{n}$. I tried calculating $E(\hat{p} \hat{q})$, which comes out to be $\frac{2n-1}{2n} p_1(1-p_1)$ under $p_1=p_2$.
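That expectation is easy to confirm numerically. A small simulation (values of $p$, $n$, and the replication count are illustrative) comparing the empirical mean of $\hat{p}\hat{q}$ against $\frac{2n-1}{2n}p(1-p)$:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, reps = 0.4, 50, 500_000

p1_hat = rng.binomial(n, p, size=reps) / n
p2_hat = rng.binomial(n, p, size=reps) / n
p_hat = (p1_hat + p2_hat) / 2

empirical = np.mean(p_hat * (1 - p_hat))          # E(p-hat * q-hat), estimated
theoretical = (2 * n - 1) / (2 * n) * p * (1 - p)  # (2n-1)/(2n) * p(1-p)
print(empirical, theoretical)
```

The two numbers agree to within Monte Carlo error.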

Intuitively, all I need to show is that $T$ follows a $t$-distribution, whose quantiles converge to normal quantiles as $n \rightarrow \infty$. But I cannot establish that the denominator's estimator is actually a (scaled) chi-squared random variable.

Anyone with a different idea?

Kroki

1 Answer


A complete answer:

By the multivariate CLT and the independence of the two samples, $$\sqrt{n} \left(\left(\widehat{p}_1, \widehat{p}_2\right)-\left(p_1, p_2\right)\right)\to \mathcal N \left(0, \begin{bmatrix} p_1\left(1-p_1\right) & 0 \\ 0 & p_2\left(1-p_2\right) \end{bmatrix}\right).$$

Let $$g\left(p_1, p_2\right) = \frac{p_1-p_2}{\sqrt{\frac{p_1+p_2}{2}\left(1-\frac{p_1+p_2}{2}\right)}}$$

so $$\nabla g\left(p_1, p_2\right) = \frac1{\left(\frac{p_1+p_2}2\left(1-\frac{p_1+p_2}{2}\right)\right)^{\frac32}} \begin{bmatrix}\frac{p_1+p_2}2\left(1-\frac{p_1+p_2}{2}\right)-\frac{p_1-p_2}2\left(\frac12 - \frac{p_1+p_2}{2}\right)\\ -\frac{p_1+p_2}2\left(1-\frac{p_1+p_2}{2}\right)-\frac{p_1-p_2}2\left(\frac12 - \frac{p_1+p_2}{2}\right)\end{bmatrix}$$

By the delta method:

$$\sqrt{n} \left(g\left(\widehat p_1, \widehat p_2\right) - g\left(p_1, p_2 \right)\right) \to \mathcal N\left(0, \nabla g\left(p_1, p_2\right)^T\begin{bmatrix} p_1\left(1-p_1\right) & 0 \\ 0 & p_2\left(1-p_2\right) \end{bmatrix}\nabla g\left(p_1, p_2\right)\right)$$

The variance is easy to compute, and even easier when $p_1 = p_2 = p$:

$$\nabla g\left(p, p\right)^T\begin{bmatrix} p\left(1-p\right) & 0 \\ 0 & p\left(1-p\right) \end{bmatrix}\nabla g\left(p, p\right) = 2.$$

Since $g(p,p)=0$, this gives $\sqrt{n}\, g\left(\widehat p_1, \widehat p_2\right) \to \mathcal N(0,2)$. Noting that $T = -\sqrt{n}\, g\left(\widehat p_1, \widehat p_2\right)/\sqrt{2}$, we get $T \to \mathcal N(0,1)$, hence $\Pi_n = P\left(T < -z_{\frac{\alpha}{2}}\right) \to \Phi\left(-z_{\frac{\alpha}{2}}\right) = \frac{\alpha}{2}$.
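The gradient computation and the final variance can be double-checked symbolically; the sketch below mirrors the quadratic form above rather than re-deriving it by hand.

```python
import sympy as sp

p1, p2, p = sp.symbols('p1 p2 p', positive=True)
m = (p1 + p2) / 2                      # pooled proportion
g = (p1 - p2) / sp.sqrt(m * (1 - m))   # the function g(p1, p2) from above

grad = sp.Matrix([sp.diff(g, p1), sp.diff(g, p2)])
Sigma = sp.diag(p1 * (1 - p1), p2 * (1 - p2))

# Delta-method asymptotic variance, evaluated at p1 = p2 = p
var = (grad.T * Sigma * grad)[0, 0]
var_at_equal = sp.simplify(var.subs({p1: p, p2: p}))
print(var_at_equal)
```

SymPy reduces the variance at $p_1=p_2=p$ to the constant $2$, matching the display above.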

Kroki