Let $X = (x_1, x_2)$ and $Y = (y_1, y_2)$ where the random variables $x_1$, $x_2$, $y_1$, $y_2$ are independent standard normal. What is the expected distance between $X$ and $Y$, i.e. what is $$D_2=E\left(\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}\right)\ ?$$ Does this value increase or decrease when the number $n$ of dimensions increases, that is when $X = (x_1, \cdots, x_n)$ and $Y = (y_1, \cdots, y_n)$ for some independent standard normal random variables $x_i$, $y_i$, and $n>2$? Does it converge when $n \rightarrow \infty$?

Note: I do not know if the resulting integrals are tractable (I would suspect that they are not).

Did
  • 284,245
Sid
  • 4,422
  • It's not clear whether the components are independent... – leonbloy Jul 04 '16 at 13:33
  • "in other words" is incorrect. The independence of the variables is a fact beyond what is stated in the first part of the sentence. – joriki Jul 04 '16 at 13:35
  • @leonbloy Forgot to clarify that they are i.i.d., fixed now. – Sid Jul 04 '16 at 13:36
  • @joriki Feel free to imagine that the previous part of the sentence said so as well. Or edit the paragraph so that it does. Either way, I hope the meaning of the question is clear now. – Sid Jul 04 '16 at 13:41
  • It seems clear that the expected distance is an increasing function of the dimension $n$, increasing to infinity, and equivalent to $\sqrt{2n}$. – Did Jul 04 '16 at 13:47
  • The density of $R=|X|$ is proportional to $r^{n-1}\mathrm e^{-r^2/2}$. The density of the component $Y_\parallel$ of $Y$ parallel to $X$ is proportional to $\mathrm e^{-y_\parallel^2/2}$, and the density of the component $Y_\perp$ of $Y$ perpendicular to $X$ is proportional to $y_\perp^{n-2}\mathrm e^{-y_\perp^2/2}$. – joriki Jul 04 '16 at 13:50
  • Thus the expected value is

    $$ \mathbb E\left[|X-Y|\right]=\frac{\int_0^\infty\mathrm dr\,r^{n-1}\mathrm e^{-r^2/2}\int_{-\infty}^\infty\mathrm dy_\parallel\,\mathrm e^{-y_\parallel^2/2}\int_0^\infty\mathrm dy_\perp\,y_\perp^{n-2}\mathrm e^{-y_\perp^2/2}\sqrt{(r-y_\parallel)^2+y_\perp^2}} {\int_0^\infty\mathrm dr\,r^{n-1}\mathrm e^{-r^2/2}\int_{-\infty}^\infty\mathrm dy_\parallel\,\mathrm e^{-y_\parallel^2/2}\int_0^\infty\mathrm dy_\perp\,y_\perp^{n-2}\mathrm e^{-y_\perp^2/2}}\;. $$

    That's not a nice integral.

    – joriki Jul 04 '16 at 13:50
  • @Did, isn't this a consequence of LLN? – zhoraster Jul 04 '16 at 14:02
  • The distribution should be obtainable from a scaling of the chi distribution whose moments one can look up. – André Nicolas Jul 04 '16 at 14:14
  • @zhoraster Indeed it is. – Did Jul 04 '16 at 16:04

3 Answers

1

I do not understand why this problem reappears without a complete solution seven years later. However, it is not hard. Suppose that $X,Y,Z$ are independent and $N(0,I_n)$ distributed. Since $X-Y\sim N(0,2I_n)$, we have $E(\|X-Y\|)=\sqrt{2}\,E(\|Z\|)$. Set $a=n/2.$ We use the formula $$\sqrt{r}=\int_{0}^{\infty}\frac{1}{2\sqrt{\pi}s^{3/2}}(1-e^{-sr})ds$$ (differentiate with respect to $r$ to check it) applied to $r=\|Z\|^2,$ together with $E(e^{-s\|Z\|^2})=(1+2s)^{-a}.$ We get $$E(\|Z\|)=\int_{0}^{\infty}\frac{1}{2\sqrt{\pi}s^{3/2}}\left(1-\frac{1}{(1+2s)^a}\right)ds$$$$=\int_{0}^{\infty}\frac{1}{2\sqrt{\pi}s^{3/2}}\left(\int_1^{1+2s}\frac{a\,dx}{x^{a+1}}\right)ds$$$$=\int_{0}^{\infty}\frac{1}{\sqrt{\pi}s^{1/2}}\left(\int_0^{1}\frac{a\,du}{(1+2su)^{a+1}}\right)ds$$$$=\frac{a}{\sqrt{\pi}}\int_{0}^{1}\left(\int_0^{\infty}\frac{1}{s^{1/2}}\frac{ds}{(1+2su)^{a+1}}\right)du$$$$=\frac{a}{\sqrt{\pi}}\int_{0}^{1}\left(\int_0^{\infty}\frac{1}{v^{1/2}}\frac{dv}{(1+v)^{a+1}}\right)\frac{du}{\sqrt{2u}}=\sqrt{2}\frac{\Gamma(\frac{n+1}{2}) }{\Gamma(\frac{n}{2})},$$ where the last step uses $\int_0^\infty v^{-1/2}(1+v)^{-a-1}dv=B(\tfrac12,a+\tfrac12)=\sqrt{\pi}\,\Gamma(a+\tfrac12)/\Gamma(a+1)$ and $\int_0^1 du/\sqrt{2u}=\sqrt{2}.$ Hence $E(\|X-Y\|)=2\,\Gamma(\frac{n+1}{2})/\Gamma(\frac{n}{2})$; for $n=2$ this gives $D_2=\sqrt{\pi}.$
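As a sanity check, here is a minimal Python sketch that compares a closed form with a plain Monte Carlo estimate. It assumes $E(\|X-Y\|)=\sqrt{2}\,E(\|Z\|)$ (because $X-Y\sim N(0,2I_n)$), so that the quantity being tested is $2\,\Gamma(\frac{n+1}{2})/\Gamma(\frac{n}{2})$. The function names `expected_distance` and `simulated_distance`, the trial counts and the seeds are illustrative choices, not part of the answer above.

```python
import math
import random


def expected_distance(n: int) -> float:
    """Closed form 2*Gamma((n+1)/2)/Gamma(n/2), via log-gamma to avoid overflow."""
    return 2.0 * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))


def simulated_distance(n: int, trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E||X - Y|| for independent standard normal X, Y in R^n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        squared = 0.0
        for _ in range(n):
            diff = rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)
            squared += diff * diff
        total += math.sqrt(squared)
    return total / trials


if __name__ == "__main__":
    # Columns: n, closed form, simulation, closed form divided by sqrt(2n).
    for n in (1, 2, 3, 10):
        exact = expected_distance(n)
        print(n, exact, simulated_distance(n), exact / math.sqrt(2 * n))
```

The last column should drift towards 1 as $n$ grows, which is the numerical counterpart of the $\sqrt{2n}$ equivalence mentioned in the comments.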

0

If everything is independent, then this is no longer a two-point problem: the variables $x_i - y_i$ are independent $N(0,2)$. The expectation is not terribly hard to compute, see here. It is increasing (moreover, the distribution itself is increasing, in terms of stochastic order) and, as @Did commented, equivalent to $\sqrt{2n}$.
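To make the monotonicity and the equivalence concrete, here is a short sketch. It assumes the mean of the scaled chi distribution, $D_n = E\|X-Y\| = 2\,\Gamma(\frac{n+1}{2})/\Gamma(\frac{n}{2})$, where $D_n$ simply extends the question's notation $D_2$. Since $\Gamma$ is strictly log-convex, $\Gamma(\frac{n+1}{2})^2<\Gamma(\frac{n}{2})\,\Gamma(\frac{n}{2}+1)$, hence

$$\frac{D_{n+1}}{D_n}=\frac{\Gamma(\frac{n}{2}+1)\,\Gamma(\frac{n}{2})}{\Gamma(\frac{n+1}{2})^2}>1,$$

so $D_n$ is increasing. The standard expansion $\Gamma(a+\tfrac12)/\Gamma(a)=\sqrt{a}\,\bigl(1-\tfrac{1}{8a}+O(a^{-2})\bigr)$ with $a=n/2$ then gives

$$D_n=\sqrt{2n}\,\left(1-\frac{1}{4n}+O(n^{-2})\right),$$

so $D_n\to\infty$ while $D_n/\sqrt{2n}\to 1$.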

zhoraster
  • 26,086
0

Here is a simplified example for $n=2$ which might help you to start out.


Assume an archer is shooting an arrow onto an infinite target with center $(0,0)$. The arrow hits at the coordinates $(X,Y)$ where $X$ and $Y$ are independent and $X,Y\sim\mathcal{N}(0,1)$. It should be clear that

$$ f_{X,Y}(x,y)=\frac{\exp(-1/2\cdot(x^2+y^2))}{2\pi} $$

since $X$ and $Y$ are independent. Let $Z=\sqrt{X^2+Y^2}$ be the distance from the center. Since a distance cannot be negative, $F_Z(z)=0$ for $z<0$, so assume $z\geq 0$ from now on. Writing $B_z(0)$ for the disc of radius $z$ around the origin and switching to polar coordinates, we get

\begin{align*} F_Z(z) &= \Pr[Z\leq z] = \Pr[(X,Y)\in B_z(0)] = \int_{B_z(0)}f_{X,Y}(s,t)\,\mathrm{d}s\,\mathrm{d}t\\ &= \int_0^z\int_0^{2\pi}r\cdot f_{X,Y}(r\cos\Theta,r\sin\Theta)\,\mathrm{d}\Theta\,\mathrm{d}r\\ &= \int_0^z\int_0^{2\pi}r\cdot \frac{\exp(-1/2\cdot(r^2\cos^2\Theta+r^2\sin^2\Theta))}{2\pi}\,\mathrm{d}\Theta\,\mathrm{d}r\\ &= \int_0^z\int_0^{2\pi}r\cdot \frac{\exp(-1/2\cdot r^2)}{2\pi}\,\mathrm{d}\Theta\,\mathrm{d}r\\ &= \int_0^zr\cdot \frac{\exp(-1/2\cdot r^2)}{2\pi}\int_0^{2\pi}\,\mathrm{d}\Theta\,\mathrm{d}r\\ &= \int_0^zr\cdot \frac{\exp(-1/2\cdot r^2)}{2\pi}\cdot 2\pi\,\mathrm{d}r\\ &= \int_0^z r\cdot\exp(-1/2\cdot r^2)\,\mathrm{d}r \end{align*}

which yields

\begin{align*} f_Z(z) = F_Z'(z)=\begin{cases} z\cdot\exp(-1/2\cdot z^2),&z\geq 0,\\ 0,&\text{otherwise}. \end{cases} \end{align*}

Now it follows that

\begin{align*} \mathbb{E}[Z] &= \int_{0}^\infty t\cdot \left (t\cdot\exp(-1/2\cdot t^2)\right )\,\mathrm{d}t\\ &= \left[-\exp(-1/2\cdot t^2)\cdot t\right]_0^\infty-\int_0^\infty -\exp(-1/2\cdot t^2)\,\mathrm{d}t \\ &= 0 + \sqrt{2\pi}\int_0^\infty\frac{\exp(-1/2\cdot t^2)}{\sqrt{2\pi}}\,\mathrm{d}t\\ &= \sqrt{2\pi}\cdot \frac{1}{2} = \sqrt{\frac{\pi}{2}}. \end{align*}

Hence the archer will miss the center by $\sqrt{\pi/2}\approx 1.25$ units on average.
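The value $\sqrt{\pi/2}$ can be checked numerically by integrating $t\cdot t\exp(-t^2/2)$ over $[0,\infty)$ directly. The sketch below uses a midpoint rule. The function name `rayleigh_mean`, the cutoff at $12$ and the step count are assumptions made for illustration; the cutoff relies on the integrand being negligible beyond that point.

```python
import math


def rayleigh_mean(upper: float = 12.0, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of t * (t * exp(-t^2/2)) on [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += t * (t * math.exp(-0.5 * t * t))
    return total * h


if __name__ == "__main__":
    print(rayleigh_mean(), math.sqrt(math.pi / 2))
```

Note that this is the distance of one Gaussian point from the origin. For the distance between two independent points, each coordinate difference has variance $2$, so the result scales by $\sqrt{2}$ to $\sqrt{\pi}$.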