10

Suppose we pick $n$ random points in the unit cube in $\mathbb{R}^3$, $p_1=\left(x_1,y_1,z_1\right),$ $p_2=\left(x_2,y_2,z_2\right),$ etc. (So, $x_i,y_i,z_i$ are $3n$ independent random variables, each uniformly distributed between $0$ and $1$.) Let $\Gamma$ be the complete graph on these $n$ points, and weight each edge $\{p_i,p_j\}$ by $$w_{ij}=\sqrt{\left(x_i-x_j\right)^2+\left(y_i-y_j\right)^2+\left(z_i-z_j\right)^2}.$$

Question: What is the expected value of the total weight of a minimal spanning tree of $\Gamma$?

(Note: Here total weight means the sum of all edges in the minimal spanning tree.)

A peripheral request: The answer is probably a function of $n$, but I don't have the computing power or a good implementation of Kruskal's algorithm to suggest what this should look like. If someone could run a simulation to estimate this average for many values of $n$, seeing the data might help towards a solution.

2 Answers

2

According to the Wikipedia article on the Euclidean minimum spanning tree (or EMST), the expected total weight is asymptotic to $$ c(d)n^{\frac{d-1}{d}}\int_{\mathbb{R}^{d}}f(x)^{\frac{d-1}{d}}dx $$ as $n\rightarrow\infty$, where $d$ is the dimensionality (here, $d=3$), $f(x)$ is the probability density of point selection (here, $f(x)=1$ inside the cube), and $c(d)$ depends only on the dimensionality. For uniform selection within the unit cube, the expected size should satisfy $$ E(n) \sim c_{3}n^{2/3} $$ for some constant $c_{3}$.

Numerical experiments (for modest $n$, up to about $8000$) suggest that $c_{3}=0.65 \pm 0.01$.
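For anyone who wants to reproduce this kind of experiment, here is a minimal sketch, not the code behind the figure above. It assumes Prim's algorithm on the dense complete graph (an $O(n^2)$ substitute for Kruskal's) and uses only the Python standard library. The names `mst_weight` and `estimate_ratio` are made up for illustration. The ratio $E(n)/n^{2/3}$ it prints should only be expected to settle near a constant for large $n$, since boundary effects matter at small $n$.

```python
import math
import random


def mst_weight(points):
    """Total edge length of the Euclidean MST, via Prim's algorithm in O(n^2)."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from each point to the tree
    best[0] = 0.0          # start the tree at point 0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        # Update the cheapest connection of every remaining point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def estimate_ratio(n, trials, seed=0):
    """Monte Carlo estimate of E(n) / n^(2/3) for uniform points in the unit cube."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        pts = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
        acc += mst_weight(pts)
    return (acc / trials) / n ** (2 / 3)


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, round(estimate_ratio(n, trials=10), 3))
```

For $n$ in the thousands the dense $O(n^2)$ approach becomes slow in pure Python; a spatial data structure or a compiled library would be the natural next step.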

mjqxxxx
  • 43,344
  • If the minimum spanning tree were non-Euclidean, how would expected size formula differ? Thank you. – Frank Feb 20 '17 at 00:23
  • What about the rectilinear non-Euclidean minimum spanning tree expected size? Is there a back of the envelope calculation for Euclidean minimum spanning tree expected size? Thank you. – Frank Feb 20 '17 at 07:05
  • By rectilinear I assume you mean Manhattan or taxicab metric? I don't know; you might get a good response if you pose it as a new stand-alone question. If I had to guess, it'd still scale as $n^{2/3}$, with a different prefactor. – mjqxxxx Feb 20 '17 at 19:31
  • Thank you for your comment. In addition, I would like to find out why expected size E(n)∼ n / log (n) when an independent random number generator is used to derive the edge weights according to a [0,1] uniform probability distribution. – Frank Feb 21 '17 at 04:06
0

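If $n = 0$ or $n = 1$, the answer is obviously $0$. If $n = 2$, the spanning tree consists of the single edge $\{\,1, 2\,\}$, so the answer is $E(w_{12})$. Using the independence of $x_1$ and $x_2$, we have $$E\left((x_1 - x_2)^2\right) = E(x_1^2 - 2x_1x_2 + x_2^2) = E(x_1^2) - 2E(x_1)\cdot E(x_2) + E(x_2^2) = \frac13 - 2\cdot\frac12\cdot\frac12 + \frac13 = \frac16.$$ The same holds for the $y$- and $z$-coordinates, so $E(w_{12}^2) = \frac16 + \frac16 + \frac16 = \frac12$. Since the square root is concave, this gives only an upper bound by Jensen's inequality: $E(w_{12}) \le \sqrt{E(w_{12}^2)} = \frac1{\sqrt2} \approx 0.707$. The exact mean distance between two uniform points in the unit cube is Robbins' constant, approximately $0.6617$.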
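The $n = 2$ case is easy to check numerically. The sketch below (the function name `two_point_moments` is made up for illustration) samples pairs of uniform points in the unit cube and estimates both $E(w_{12}^2)$ and $E(w_{12})$, which makes the gap between the mean of the square root and the square root of the mean visible.

```python
import math
import random


def two_point_moments(samples, seed=0):
    """Estimate (E(w), E(w^2)) for the distance w between two uniform points
    in the unit cube, by plain Monte Carlo sampling."""
    rng = random.Random(seed)
    sum_d = 0.0
    sum_d2 = 0.0
    for _ in range(samples):
        p = (rng.random(), rng.random(), rng.random())
        q = (rng.random(), rng.random(), rng.random())
        d = math.dist(p, q)
        sum_d += d
        sum_d2 += d * d
    return sum_d / samples, sum_d2 / samples


if __name__ == "__main__":
    mean_d, mean_d2 = two_point_moments(200_000)
    print("estimate of E(w^2):      ", mean_d2)
    print("sqrt of that estimate:   ", math.sqrt(mean_d2))
    print("estimate of E(w):        ", mean_d)
```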

It is possible to work through several cases for $n = 3$, but for arbitrary $n$ I don't expect a closed-form answer.

Smylic
  • 8,098