9

Randomized Meldable Heaps have an operation "meld", which we then use to define all other operations, including insert.

The question is, what is an expected height of that tree with $n$ nodes?

Theorem 1 of Gambin and Malinkowski, Randomized Meldable Priority Queues (Proceedings of SOFSEM 1998, Lecture Notes in Computer Science vol. 1521, pp. 344–349, 1998; PDF) gives the answer to this question with proof. However, I do not understand why we can write: $$\mathbb{E} [ h_Q] = \frac{1}{2} ((1 + \mathbb{E}[h_{Q_L}]) + (1 + \mathbb{E}[h_{Q_R}]))\,.$$

For me the height of the tree is

$$h_Q = 1 + \max\, \{ h_{Q_L}, h_{Q_R}\}\,,$$

which I can expand to:

$$\mathbb{E} [ h_Q] = 1 + \mathbb{E}[\max \,\{ h_{Q_L}, h_{Q_R}\}] = 1 + \sum k \mathbb{P}[\max\, \{ h_{Q_L}, h_{Q_R}\} = k]\,.$$

The probability that the maximum of a height of two subtrees is equal to $k$ can be rewritten using the law of total probability:

\begin{align*} \hspace{2em}&\hspace{-2em} \mathbb{P}[\max\, \{ h_{Q_L}, h_{Q_R}\} = k] \\ &= \mathbb{P}[\max\, \{ h_{Q_L}, h_{Q_R}\} = k \mid h_{Q_L}\leq h_{Q_R}]\,\mathbb{P}[h_{Q_L} \leq h_{Q_R}]\\ &\hspace{2em} + \mathbb{P}[\max\, \{ h_{Q_L}, h_{Q_R}\} = k \mid h_{Q_L} > h_{Q_R}]\,\mathbb{P}[h_{Q_L} > h_{Q_R}] \\ &= \mathbb{P}[h_{Q_R} = k \mid h_{Q_L} \leq h_{Q_R}]\,\mathbb{P}[h_{Q_L} \leq h_{Q_R}]\\ &\hspace{2em}+ \mathbb{P}[h_{Q_L} = k \mid h_{Q_L} > h_{Q_R}]\,\mathbb{P}[h_{Q_L} > h_{Q_R}]\,. \end{align*}

So at the end I get:

\begin{align*}\mathbb{E} [ h_Q] &= 1 + \sum k \{ \mathbb{P}[h_{Q_R} = k \mid h_{Q_L} \leq h_{Q_R}]\,\mathbb{P}[h_{Q_L} \leq h_{Q_R}]\\ &\hspace{2em}+ \mathbb{P}[h_{Q_L} = k \mid h_{Q_L} > h_{Q_R}]\,\mathbb{P}[h_{Q_L} > h_{Q_R}] \}\,. \end{align*}

This is where I am stuck. I can see that $\mathbb{P}[h_{Q_L} > h_{Q_R}]$ is more or less equal $\frac{1}{2}$ (However we need at most $\leq \frac{1}{2}$). But except that nothing leading to the formula from the beginning.

The heights of the subtrees do not seem to be independent to me.

Thanks for help.

Raphael
  • 73,212
  • 30
  • 182
  • 400

1 Answers1

4

In the paper, $h_Q$ isn't the height. It's the length of a random walk away from the root in a full binary tree (they insist every leaf is "nil"), so the expression they have is the right thing.

Also, you can avoid induction. The probability of ending at a specific leaf of depth $d$ is just $2^{-d}$. So the expected length of the walk is

$$ \sum_{\ell\in \text{leaves}(Q)} \text{depth}(\ell)2^{-\text{depth}(\ell)} $$

which the entropy of a distribution a set of size $|\text{leaves}(Q)|$.

Louis
  • 2,946
  • 17
  • 25