
Suppose random variables $X$ and $Y$ are i.i.d. Normal$(0,1)$. Consider the following events, where $\varepsilon>0, c>0$:

$$\begin{align*} Q&=\{(x,y)\in\Bbb R^2: x>c, y>c\}\\ C&=\{(x,y)\in Q:y<x-\varepsilon\}\\ D&=\{(x,y)\in Q:x-\varepsilon<y<x\}\\ D'&=\{(x,y)\in Q:x<y<x+\varepsilon\}\\ C'&=\{(x,y)\in Q:y>x+\varepsilon\}\\ \end{align*}$$

[Figure: the relevant events in the $(x,y)$-plane]

Note that $Q$ is the "shifted 1st-quadrant" with origin shifted to the point $(c,c)$ on the diagonal line $y=x$.

Now consider the probability that the random point $(X,Y)$ falls in the narrow diagonal strip $D\cup D^\prime$ (with $\varepsilon>0$ arbitrarily small), given that it falls in the remote "shifted 1st-quadrant" $Q$ (with $c>0$ increasingly large). Denoting this probability $p(\varepsilon,c)$, I will (later) derive the following formula:

$$p(\varepsilon,c)=1-2\,{\int_{c+\varepsilon}^\infty\phi(x)\Phi(x-\varepsilon)\,dx-\bar\Phi(c+\varepsilon)\Phi(c) \over \bar\Phi(c)^2}\tag{1}$$

where $\phi$ and $\Phi$ are the standard normal p.d.f. and c.d.f., respectively, and $\bar\Phi=1-\Phi$.

Since I could find no way to simplify the integral in this equation, I evaluated it numerically to obtain the following results -- with corresponding results for i.i.d. Laplacian random variables shown for comparison/contrast:
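For anyone who wants to reproduce these numbers, Eq. (1) can be evaluated with nothing beyond the standard library, using $\bar\Phi(x)=\operatorname{erfc}(x/\sqrt2)/2$ and Simpson's rule for the integral. One caveat: as written, Eq. (1) subtracts two nearly equal quantities for large $c$; substituting $\Phi=1-\bar\Phi$ rewrites the numerator as $\bar\Phi(c+\varepsilon)\bar\Phi(c)-\int_{c+\varepsilon}^\infty\phi(x)\bar\Phi(x-\varepsilon)\,dx$, which is safe in double precision. A sketch (the values of $\varepsilon$ and $c$ are illustrative):

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def phi(x):
    # standard normal p.d.f.
    return math.exp(-0.5 * x * x) / SQRT2PI

def Phi_bar(x):
    # upper tail 1 - Phi(x), via erfc to stay accurate for large x
    return 0.5 * math.erfc(x / SQRT2)

def simpson(f, a, b, n=4000):
    # composite Simpson's rule on [a, b]; n must be even
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def p(eps, c):
    # Eq. (1) with the numerator rewritten (via Phi = 1 - Phi_bar) as
    #   Phi_bar(c+eps)*Phi_bar(c) - int_{c+eps}^inf phi(x)*Phi_bar(x-eps) dx,
    # avoiding catastrophic cancellation when c is large.  The integrand
    # decays like exp(-x^2/2), so truncating at c + eps + 12 is safe.
    J = simpson(lambda x: phi(x) * Phi_bar(x - eps), c + eps, c + eps + 12.0)
    num = Phi_bar(c + eps) * Phi_bar(c) - J
    return 1.0 - 2.0 * num / Phi_bar(c) ** 2

for c in (1.0, 4.0, 8.0):
    print(f"p(eps=0.5, c={c}) = {p(0.5, c):.6f}")
```

The printed values increase toward 1 as $c$ grows, matching the plotted behavior.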

[Figure: plots of $p$ vs. $c$ for Normal and Laplacian]

Question: How can it be proved that (as suggested by the above results) $\lim\limits_{c\to\infty}p(\varepsilon,c)=1$ for all $\varepsilon>0$ (or even for some particular $\varepsilon>0$)?

Actually, my immediate question is whether it's even plausible that no matter how thin the strip $D\cup D^\prime$, the point $(X,Y)$ is practically certain to fall there, given that it falls in a "shifted 1st-quadrant" $Q$ sufficiently far out on the diagonal!?

If this is the correct asymptotic behavior, it seems sufficiently "paradoxical" that surely it (or something similar) would be mentioned in the literature. (?)

NB: I've double-checked the precision of the numerical integration, and I also ran Monte Carlo simulations that confirm (though with fewer cases) the behavior shown in the above plots.
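A minimal version of such a Monte Carlo check is sketched below (the values of $\varepsilon$, $c$, and the sample size are illustrative). Naive rejection sampling is hopeless here, since the acceptance probability $\bar\Phi(c)^2$ is already about $10^{-9}$ at $c=4$, so instead each coordinate is drawn from the conditional law $X\mid X>c$ by inverse-CDF sampling:

```python
import random
from statistics import NormalDist

ND = NormalDist()  # standard normal

def sample_tail(c, rng):
    # Draw X | X > c by inversion: given X > c, Phi_bar(X) is uniform on
    # (0, Phi_bar(c)), and X = -inv_cdf(Phi_bar(X)) since Phi_bar(x) = Phi(-x).
    # Using the small tail probability cdf(-c) directly (rather than 1 - cdf(c))
    # preserves double precision for moderately large c.
    v = ND.cdf(-c) * (1.0 - rng.random())  # uniform on (0, Phi_bar(c)]
    return -ND.inv_cdf(v)

def mc_strip_prob(eps, c, n, seed=0):
    # Monte Carlo estimate of P(|X - Y| < eps | X > c, Y > c)
    rng = random.Random(seed)
    hits = sum(
        abs(sample_tail(c, rng) - sample_tail(c, rng)) < eps for _ in range(n)
    )
    return hits / n

for c in (1.0, 4.0):
    print(f"c = {c}: estimated P(D u D' | Q) = {mc_strip_prob(0.5, c, 100_000):.4f}")
```

With $n=10^5$ samples the standard error is roughly $0.001$–$0.002$, enough to see the estimate climb toward 1 with $c$.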


Derivation of Eq.(1)

(This is "spin-off" from my answer to a related question.)

$$\begin{align*} p(\varepsilon,c)& := P(D\cup D^\prime\mid Q)\\ &\overset{1}{=}2\,P(D\mid Q)\\ &\overset{2}{=}2\,\left({1\over2}- P(C\mid Q)\right)\\ &=1-2\,{P(C)\over P(Q)}\\[2ex] &=1-2\,{\int_{c+\varepsilon}^\infty\phi(x) \int_c^{x-\varepsilon}\phi(y)\,dy\, dx \over \bar\Phi(c)^2}\\[2ex] &=1-2\,{\int_{c+\varepsilon}^\infty\phi(x)\left(\Phi(x-\varepsilon)-\Phi(c)\right)\, dx \over \bar\Phi(c)^2}\\[2ex] p(\varepsilon,c)&=1-2\,{\int_{c+\varepsilon}^\infty\phi(x)\Phi(x-\varepsilon)\,dx-\bar\Phi(c+\varepsilon)\Phi(c) \over \bar\Phi(c)^2} \end{align*}$$

where the equalities marked $\overset{1}{=}$ and $\overset{2}{=}$ are justified because ...

  1. The joint p.d.f. is symmetric and centered on the origin $O$.
  2. $P(C\cup D\mid Q)=P(C^\prime\cup D^\prime\mid Q)={1\over2}P(Q\mid Q)={1\over2}$, again because of the symmetry of the joint p.d.f.
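As a sanity check on this derivation, $P(D\cup D^\prime\mid Q)$ can also be computed directly by integrating out $y$ first: $P(D\cup D^\prime)=\int_c^\infty\phi(x)\left[\Phi(x+\varepsilon)-\Phi(\max(c,x-\varepsilon))\right]dx$. A stdlib sketch (the outer integral is split at $x=c+\varepsilon$, where the inner lower limit switches branch; the $(\varepsilon,c)$ test points are arbitrary), which should agree with Eq. (1) to within integration error:

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def phi(x):
    # standard normal p.d.f.
    return math.exp(-0.5 * x * x) / SQRT2PI

def Phi(x):
    # standard normal c.d.f.
    return 0.5 * math.erfc(-x / SQRT2)

def Phi_bar(x):
    # upper tail 1 - Phi(x)
    return 0.5 * math.erfc(x / SQRT2)

def simpson(f, a, b, n=2000):
    # composite Simpson's rule on [a, b]; n must be even
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def p_eq1(eps, c):
    # Eq. (1) exactly as stated in the question (fine for moderate c)
    integral = simpson(lambda x: phi(x) * Phi(x - eps), c + eps, c + eps + 12.0)
    return 1.0 - 2.0 * (integral - Phi_bar(c + eps) * Phi(c)) / Phi_bar(c) ** 2

def p_direct(eps, c):
    # P(D u D') = int_c^inf phi(x) * [Phi(x+eps) - Phi(max(c, x-eps))] dx,
    # split at x = c + eps where the inner lower limit max(c, x-eps) switches
    part1 = simpson(lambda x: phi(x) * (Phi(x + eps) - Phi(c)), c, c + eps, n=200)
    part2 = simpson(lambda x: phi(x) * (Phi(x + eps) - Phi(x - eps)),
                    c + eps, c + eps + 12.0)
    return (part1 + part2) / Phi_bar(c) ** 2

for eps, c in [(0.5, 1.0), (0.25, 2.0)]:
    print(f"eps={eps}, c={c}: Eq.(1) = {p_eq1(eps, c):.9f}, "
          f"direct = {p_direct(eps, c):.9f}")
```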
r.e.s.
  • Since you just want to evaluate the limit, you do not need to compute the exact "closed" form - just apply L'Hôpital's rule to the fraction? – BGM Jul 05 '24 at 02:22

2 Answers


The strip is a bit misleading here. The point is that, as $c$ grows, the probability gets concentrated around the point $(c, c)$, which always lies in the closure of the strip.

We can use this intuition to find a lower bound for the probability. Namely, we can use the fact that $$ (c, c+\epsilon)^2 \subseteq \{(x, y) : x-\epsilon < y < x +\epsilon, x, y > c\} $$ to compute \begin{align*} \Pr(X - \epsilon < Y < X + \epsilon | X, Y > c) &\geq \Pr(X, Y < c + \epsilon | X, Y > c) \\ &= \Pr(X < c + \epsilon | X > c)^2 \\ &= \biggl[1 - \frac{\Pr(X > c + \epsilon)}{\Pr(X > c)}\biggr]^2. \end{align*}

Now, we can use standard Gaussian tail bounds to compute the latter quantity (see, e.g., any standard reference on Gaussian tail bounds). Namely, we use $$ \biggl(\frac{1}{x} - \frac{1}{x^3}\biggr)\frac{e^{-x^2/2}}{\sqrt{2\pi}} < \Pr(X > x) < \frac{1}{x}\frac{e^{-x^2/2}}{\sqrt{2\pi}}, $$ from which we can calculate \begin{align*} \frac{\Pr(X > c + \epsilon)}{\Pr(X > c)} &\leq \frac{(c+\epsilon)^{-1} e^{-(c + \epsilon)^2/2}}{(c^{-1} - c^{-3}) e^{-c^2/2}} \\ &= \frac{\exp\{-c\epsilon - \epsilon^2/2\}}{(1 + \epsilon/c)(1 - c^{-2})} \\ &\rightarrow 0. \end{align*}

Thus, we have that \begin{align*} \liminf_{c \rightarrow \infty}\Pr(X - \epsilon < Y < X + \epsilon | X, Y > c) &\geq \liminf_{c \rightarrow \infty} \biggl[1 - \frac{\Pr(X > c + \epsilon)}{\Pr(X > c)}\biggr]^2 = 1, \end{align*} and so the result follows.
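Numerically, the ratio in this bound dies off very quickly. A stdlib sketch tabulating $\bar\Phi(c+\epsilon)/\bar\Phi(c)$ and the resulting lower bound ($\epsilon=0.5$ and the $c$ grid are illustrative):

```python
import math

def Phi_bar(x):
    # standard normal upper tail, via the complementary error function
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_ratio(eps, c):
    # Pr(X > c + eps) / Pr(X > c)
    return Phi_bar(c + eps) / Phi_bar(c)

def lower_bound(eps, c):
    # the bound derived above: P(strip | Q) >= (1 - tail_ratio)^2
    return (1.0 - tail_ratio(eps, c)) ** 2

eps = 0.5
for c in (2.0, 5.0, 10.0, 20.0):
    print(f"c = {c:4.0f}: ratio = {tail_ratio(eps, c):.3e}, "
          f"lower bound = {lower_bound(eps, c):.6f}")
```

The exact ratio also stays below the analytic estimate $\exp\{-c\epsilon-\epsilon^2/2\}/[(1+\epsilon/c)(1-c^{-2})]$ from the display above, as it must.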

  • Very nice, and intuitive! (There's an inconsequential typo where it should be $c^{-1}-c^{-3}$. Also, although $(c,c)$ is always on the boundary of the strip $D\cup D^\prime$, it's not in it -- but again that's of no consequence.) – r.e.s. Jul 05 '24 at 13:44
  • After @BGM's answer, I notice that L'Hôpital's rule would also have easily given your $\lim_{c\to\infty}\frac{\Pr(X > c + \epsilon)}{\Pr(X > c)}=0$. (I thought it worth mentioning.) – r.e.s. Jul 05 '24 at 23:07
  • Ah yes, thanks for pointing these out! Should be fixed now – Damian Pavlyshyn Jul 06 '24 at 01:39

Just to show the computation via L'Hôpital's rule.

From the formula (1) derived in the OP, $$ \lim_{c\to\infty} p(\varepsilon, c) = 1 \iff \lim_{c\to\infty} \frac {\displaystyle \int_{c+\varepsilon}^{\infty} \phi(x)\Phi(x - \varepsilon)dx - \bar{\Phi}(c + \varepsilon)\Phi(c)} {\bar{\Phi}(c)^2} \triangleq \lim_{c\to\infty} \frac {f(c)} {g(c)} = 0$$

Since $$ \lim_{c\to\infty} f(c) = \lim_{c\to\infty} \int_{c+\varepsilon}^{\infty} \phi(x)\Phi(x - \varepsilon)dx - \bar{\Phi}(c + \varepsilon)\Phi(c) = 0 - 0 = 0 $$ $$ \lim_{c\to\infty} g(c) = \lim_{c\to\infty} \bar{\Phi}(c)^2 = 0 $$

we have a $0/0$ indeterminate form.

Both $f(c)$ and $g(c)$ are continuously differentiable, and $$ g'(c) = -2\bar{\Phi}(c)\phi(c)$$ is non-zero for any finite $c$, so we can apply L'Hôpital's rule.

Then

$$ f'(c) = -\phi(c + \varepsilon)\Phi(c) - [-\phi(c + \varepsilon)\Phi(c) + \bar{\Phi}(c + \varepsilon)\phi(c)] = -\bar{\Phi}(c + \varepsilon)\phi(c) $$

So $$ \lim_{c\to\infty} \frac {f'(c)} {g'(c)} = \lim_{c\to\infty} \frac {-\bar{\Phi}(c + \varepsilon)\phi(c)} {-2\bar{\Phi}(c)\phi(c)} = \lim_{c\to\infty} \frac {\bar{\Phi}(c + \varepsilon)} {2\bar{\Phi}(c)} $$

This is another $0/0$ indeterminate form, so apply L'Hôpital's rule again to the above limit:

$$ \begin{align} \lim_{c\to\infty} \frac {-\phi(c + \varepsilon)} {-2\phi(c)} &= \frac {\sqrt{2\pi}} {2\sqrt{2\pi}} \lim_{c\to\infty} \exp\left\{- \frac {(c + \varepsilon)^2 - c^2} {2} \right\} \\ &= \frac {1} {2} \lim_{c\to\infty} \exp\left\{- \frac {2\varepsilon c + \varepsilon^2} {2} \right\} \\ &= 0 \end{align} $$ for any $\varepsilon > 0$, which proves the claim.
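The differentiation step for $f$ can be sanity-checked numerically with a central finite difference against the closed form $f'(c) = -\bar\Phi(c+\varepsilon)\phi(c)$ obtained above (stdlib only; the test point $c=1$, $\varepsilon=0.5$ is arbitrary):

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def phi(x):
    # standard normal p.d.f.
    return math.exp(-0.5 * x * x) / SQRT2PI

def Phi(x):
    # standard normal c.d.f.
    return 0.5 * math.erfc(-x / SQRT2)

def Phi_bar(x):
    # upper tail 1 - Phi(x)
    return 0.5 * math.erfc(x / SQRT2)

def simpson(f, a, b, n=4000):
    # composite Simpson's rule on [a, b]; n must be even
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def f(c, eps):
    # numerator of Eq. (1); the integrand is negligible beyond c + eps + 12
    integral = simpson(lambda x: phi(x) * Phi(x - eps), c + eps, c + eps + 12.0)
    return integral - Phi_bar(c + eps) * Phi(c)

def f_prime(c, eps):
    # the closed form derived in the answer: f'(c) = -Phi_bar(c + eps) * phi(c)
    return -Phi_bar(c + eps) * phi(c)

c, eps, h = 1.0, 0.5, 1e-5
fd = (f(c + h, eps) - f(c - h, eps)) / (2.0 * h)  # central difference
print(f"finite difference: {fd:.8f}, closed form: {f_prime(c, eps):.8f}")
```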

Amir
BGM