In his 1979 paper, Comer gives this equation for the minimum number of nodes in a B-Tree:
$$ \sum_{l=0}^{h} d^{l} = \frac{d^{h}-1}{2d-1} $$
I tried to derive this result myself by applying the geometric sum formula to the $h$-term series $2, 2d, 2d^2, 2d^3, \dots, 2d^{h-1}$ (one term per level below the root), which gave me
$$ \frac{2(d^{h}-1}{d-1} \cdot \frac{1}{1} = \frac{2(d^{h}-1)}{d-1} $$
I assumed I had made a mistake, but when I tested both formulas, mine matched the expected node counts and Comer's did not. Here's a table for a B-Tree of order 2 ($d=2$) at heights $0,1,2,3$. Note: both Comer and I omit the root node from the calculation.
| Height | Expected Nodes | Comer's Formula | My Formula |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 2 | 1/3 (Non-Integer) | 2 |
| 2 | 6 = 2 + 4 | 1 | 6 |
| 3 | 14 = 2 + 4 + 8 | 7/3 (Non-Integer) | 14 |
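The table can be reproduced with a short script. This is only a minimal sketch: the function names (`expected_nodes`, `comer_formula`, `my_formula`) are my own labels, not anything from the paper, and `comer_formula` encodes the equation exactly as I've quoted it above, so any transcription slip on my part would carry over.

```python
from fractions import Fraction


def expected_nodes(d, h):
    # Direct sum 2 + 2d + ... + 2d^(h-1): non-root nodes on levels 1..h
    return sum(2 * d**level for level in range(h))


def comer_formula(d, h):
    # (d^h - 1) / (2d - 1), as quoted from the paper (kept as an exact fraction)
    return Fraction(d**h - 1, 2 * d - 1)


def my_formula(d, h):
    # 2(d^h - 1) / (d - 1), the closed form of the geometric sum; needs d > 1
    return Fraction(2 * (d**h - 1), d - 1)


# Columns: height, expected nodes, Comer's formula, my formula
for h in range(4):
    print(h, expected_nodes(2, h), comer_formula(2, h), my_formula(2, h))
```

For $d=2$ and heights 0 to 3 this prints the same values as the table.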
Moreover, Comer later gives an inequality for a tree with $n$ nodes: $$2d\left(\frac{d^{h}-1}{d-1}\right)\leq n$$
This inequality appears to use a different formula for the minimum node count. From it, Comer then states the results $$ 2d^{h}\leq n + 1 $$ and $$ h \leq \log_d {\frac{n+1}{2}} $$
I can't figure out how Comer got this result, either from my formula based on the geometric sum or from the equations he stated earlier. So my question is:
- How did Comer derive the mathematical formulas shown in his 1979 paper?
Comer's paper is linked here: https://dl.acm.org/doi/pdf/10.1145/356770.356776