
I have to find the maximum, minimum, and average height of a BST with $n$ nodes. After doing some research I found that the maximum height is $n-1$ and the minimum height is $\lceil \log_2(n+1) \rceil - 1$. My question is: how do I get the average height of a BST with $n$ nodes?

xskxzr
zSkeeter135

1 Answer


Ok, so we need to get a bit mathematical here. Let's first define the following quantities:

  • $X_n$ is the height of a BST composed of $n$ nodes.
  • $Y_n = 2^{X_n}$ is referred to as the exponential height.

One of the BST's properties is that the left subtree contains only keys less than the root, and the right subtree contains only keys greater than the root. This property is recursive, so it applies at every node. Now assume the tree is built by inserting the $n$ keys in a uniformly random order, and suppose the root turns out to be the $i^{th}$ smallest key. Then the left subtree has $i-1$ elements and the right subtree has $n-i$ elements. The height of the tree is one more than the larger of the two subtree heights, so for $n \geq 2$, with the conventions $Y_0 = 0$ and $Y_1 = 1$, $$Y_n = 2 \max (Y_{i-1},Y_{n-i})$$

Every key is equally likely to end up as the root, so the rank $i$ of the root has the following probability $$Pr(i) = \frac{1}{n} \quad \forall i=1\ldots n$$

Now, let us find the expected value of $Y_n$. Since the $Y_i$ are non-negative, $\max(a,b) \leq a+b$, which gives the inequality in the fourth line: \begin{align} E(Y_n) &= E[2 \max (Y_{i-1},Y_{n-i})]\\ &= 2 \sum\limits_{i=1}^n Pr(i)E[\max (Y_{i-1},Y_{n-i})] \\ &= \frac{2}{n} \sum\limits_{i=1}^n E[\max (Y_{i-1},Y_{n-i})] \\ &\leq \frac{2}{n} \sum\limits_{i=1}^n E(Y_{i-1}) + \frac{2}{n} \sum\limits_{i=1}^n E(Y_{n-i}) \\ &= \frac{2}{n} \sum\limits_{i=0}^{n-1} E(Y_{i}) + \frac{2}{n} \sum\limits_{i=0}^{n-1} E(Y_i) \\ &= \frac{4}{n} \sum\limits_{i=0}^{n-1} E(Y_i) \end{align}
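To sanity-check the recurrence numerically, here is a minimal Python sketch that computes the exact distribution of $X_n$ for small $n$. It is not part of the original argument and rests on assumptions: keys are inserted in uniformly random order, the two subtrees are independent once the root's rank is fixed, the empty tree has height $-1$, and $Y_0 = 0$. The function names are made up for illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def prob_height_at_most(n: int, h: int) -> Fraction:
    """P(X_n <= h) for a BST built from n keys inserted in random order."""
    if n == 0:
        # Assumed convention: the empty tree has height -1.
        return Fraction(1) if h >= -1 else Fraction(0)
    if h < 0:
        return Fraction(0)
    # The root has rank i with probability 1/n; the tree has height <= h
    # exactly when both subtrees have height <= h - 1.
    total = sum(
        prob_height_at_most(i - 1, h - 1) * prob_height_at_most(n - i, h - 1)
        for i in range(1, n + 1)
    )
    return total / n


def height_distribution(n: int) -> dict[int, Fraction]:
    """Exact P(X_n = h) for h = 0 .. n - 1 (meaningful for n >= 1)."""
    return {
        h: prob_height_at_most(n, h) - prob_height_at_most(n, h - 1)
        for h in range(n)
    }


def expected_height(n: int) -> Fraction:
    """E[X_n], the average height."""
    return sum(h * p for h, p in height_distribution(n).items())


def expected_exp_height(n: int) -> Fraction:
    """E[Y_n] with Y_n = 2^{X_n}; Y_0 = 0 is an assumed convention."""
    if n == 0:
        return Fraction(0)
    return sum(2**h * p for h, p in height_distribution(n).items())


if __name__ == "__main__":
    for n in range(2, 11):
        lhs = expected_exp_height(n)
        # Right-hand side of the recurrence: (4/n) * sum_{i=0}^{n-1} E[Y_i]
        rhs = Fraction(4, n) * sum(expected_exp_height(i) for i in range(n))
        print(n, float(expected_height(n)), float(lhs), float(rhs), lhs <= rhs)
```

For example, `expected_height(3)` evaluates to `Fraction(5, 3)`: of the six insertion orders of three keys, two put the median at the root (height 1) and four produce a path (height 2).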

Now, by induction, you can actually prove \begin{equation} E(Y_n) \leq \frac{1}{4} C_{n+3}^3 \end{equation}

Remember that we have \begin{equation} 2^{X_n} = Y_n \end{equation} or \begin{equation} E(2^{X_n}) = E(Y_n) \end{equation} Since $2^x$ is convex, Jensen's inequality lets us say that \begin{equation} 2^{E(X_n)}\leq E(2^{X_n}) = E(Y_n) \end{equation} Combining this with the bound above, \begin{equation} 2^{E(X_n)}\leq E(Y_n) \leq \frac{1}{4} C_{n+3}^3 \end{equation}

And note that \begin{equation} C_{n+3}^3 = \frac{(n+3)!}{3!n!} = \frac{(n+3)(n+2)(n+1)}{6} = O(n^3) \end{equation} So \begin{equation} 2^{E(X_n)} = O(n^3) \end{equation} Take $\log_2$ on both sides, \begin{equation} E(X_n) = O(\log n^3) = O(3 \log n) = O(\log n) \end{equation}

Therefore, the average height of a BST built from keys inserted in random order is $O(\log n)$, which is what you'd expect out of binary stuff, right?
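The induction step is only asserted above, so here is a sketch of how it can go. It assumes the conventions $Y_0 = 0$ and $Y_1 = 1$, which cover the base cases since $0 \leq \frac{1}{4} C_{3}^3$ and $1 \leq \frac{1}{4} C_{4}^3 = 1$. For $n \geq 2$, plug the induction hypothesis into the recurrence $E(Y_n) \leq \frac{4}{n} \sum_{i=0}^{n-1} E(Y_i)$ and use the hockey-stick identity $\sum_{i=0}^{n-1} C_{i+3}^3 = C_{n+3}^4$: \begin{align} E(Y_n) &\leq \frac{4}{n} \sum\limits_{i=0}^{n-1} E(Y_i) \\ &\leq \frac{4}{n} \sum\limits_{i=0}^{n-1} \frac{1}{4} C_{i+3}^3 \\ &= \frac{1}{n} C_{n+3}^4 \\ &= \frac{1}{n} \cdot \frac{(n+3)!}{4!\,(n-1)!} \\ &= \frac{1}{4} \cdot \frac{(n+3)!}{3!\,n!} \\ &= \frac{1}{4} C_{n+3}^3 \end{align}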

Ahmad Bazzi