Let $H(n)$ be the average height of a BST with nodes from $\{1,\dots,n\}$. I think that $$H(n) = \frac{1}{n}\sum_{i = 0}^{n-1}\left[\max(H(i), H(n-1-i)) + 1\right]$$ but I don't know how to prove this is correct. Can anyone help me, or explain why it is incorrect?

Note that proving it directly seems hard, because you end up with $E[\max(X, Y)]$, where $X$ is the height of the left subtree and $Y$ is the height of the right subtree. I'm not sure how to deal with this if we're looking for an exact recursion, not just a bound.

2 Answers

The formula is wrong.

Indeed, we can find by hand that $H(1) = 0$, $H(2) = 1$ and $H(3) = \frac95$ ($5$ different trees, $4$ of them being of height $2$, the last of height $1$).

But for $n = 3$ your formula gives: $$\frac13\sum\limits_{i=0}^2(\max(H(i), H(3 - 1 - i)) + 1) = \frac13(H(2) + H(1) + H(2) + 3) = \frac53\neq\frac95$$

The reason is that in your formula, you do not weigh each $H(i)$ by the number of trees of size $i$.
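The hand computation above can be double-checked with a short script. This is only a sketch under two assumptions that the post does not state explicitly: "average" means every distinct tree shape is equally likely, and height counts edges, so the empty tree gets height $-1$. The names `height_counts`, `average_height` and `proposed` are made up for illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def height_counts(n):
    """Map height -> number of distinct BST shapes on n keys.

    Assumption: height counts edges, so the empty tree has height -1.
    """
    if n == 0:
        return {-1: 1}
    counts = {}
    for i in range(n):  # i keys in the left subtree, n - 1 - i in the right
        for hl, cl in height_counts(i).items():
            for hr, cr in height_counts(n - 1 - i).items():
                h = max(hl, hr) + 1
                counts[h] = counts.get(h, 0) + cl * cr
    return counts


def average_height(n):
    """Average height when every tree shape is weighted equally (assumed)."""
    counts = height_counts(n)
    total = sum(counts.values())
    return Fraction(sum(h * c for h, c in counts.items()), total)


@lru_cache(maxsize=None)
def proposed(n):
    """The recurrence from the question, with H(0) = -1 assumed."""
    if n == 0:
        return Fraction(-1)
    return sum(max(proposed(i), proposed(n - 1 - i)) + 1 for i in range(n)) / n


print(average_height(3), proposed(3))  # prints: 9/5 5/3
```

For $n = 3$ the enumeration finds 4 shapes of height 2 and 1 shape of height 1, while the recurrence averages over the 3 root choices without accounting for how many shapes each choice produces.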

Nathaniel

This complements the upper bound in https://cs.stackexchange.com/a/96451/12322.

Lower bound

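Assuming, as in the upper-bound argument, that $Y_n = 2\max(Y_{i-1}, Y_{n-i})$ when the root has rank $i$ (each rank having probability $\frac1n$), and using $\max(a, b) \geq \frac{a+b}{2}$:

$\displaystyle E[Y_n] = \frac{2}{n} \sum_{i=1}^{n} E[\max(Y_{i-1}, Y_{n-i})] \geq \frac{1}{n} \sum_{i=1}^{n} E[Y_{i-1} + Y_{n-i}] = \frac{2}{n} \sum_{i=0}^{n-1} E[Y_i]$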
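The inequality $E[Y_n] \geq \frac{2}{n}\sum_{i=0}^{n-1} E[Y_i]$ can be checked numerically for small $n$. The sketch below rests on assumptions this answer does not spell out: keys are inserted in uniformly random order, $Y_n = 2^{X_n}$ is the exponential height with $X_n$ counted in edges, and by convention $Y_0 = 0$ and $Y_1 = 1$. The function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def y_dist(n):
    """Exact distribution of Y_n as {value: probability} (assumed model)."""
    if n == 0:
        return {0: Fraction(1)}  # convention assumed: Y_0 = 0
    if n == 1:
        return {1: Fraction(1)}  # single node: height 0, so Y_1 = 2**0
    dist = {}
    for i in range(1, n + 1):  # the root has rank i with probability 1/n
        # Given the root, the two subtrees are independent random BSTs.
        for yl, pl in y_dist(i - 1).items():
            for yr, pr in y_dist(n - i).items():
                y = 2 * max(yl, yr)
                dist[y] = dist.get(y, Fraction(0)) + pl * pr / n
    return dist


def expected_y(n):
    """E[Y_n] computed exactly from the distribution."""
    return sum(y * p for y, p in y_dist(n).items())


for n in range(1, 9):
    lower = Fraction(2, n) * sum(expected_y(i) for i in range(n))
    assert expected_y(n) >= lower
    print(n, expected_y(n), lower)
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point rounding.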

Subhankar Ghosal