12

Knowing the frequencies of each symbol, is it possible to determine the maximum height of the tree without applying the Huffman algorithm? Is there a formula that gives this tree height?

Raphael
user7060

3 Answers

4

Huffman coding gets within one bit of the entropy of the source. This means that if you calculate the entropy $H$ of your symbol frequencies, the average codeword length $L$ (i.e. the frequency-weighted average depth of the leaves, not the maximum height) satisfies $H \le L < H + 1$. You can use this average to bound the longest length (on average), or you can use combinatorial methods to get deterministic bounds.
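To see how close the average codeword length sits to the entropy for a concrete set of counts, both can be computed directly. The following is a minimal sketch, not part of the original answer: the helper names `entropy_bits` and `huffman_average_length` are made up for illustration, and the input is assumed to be a non-empty slice of positive integer counts.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Shannon entropy, in bits per symbol, of the distribution given by raw counts.
/// Assumes at least one count is positive.
fn entropy_bits(counts: &[u64]) -> f64 {
    let total: u64 = counts.iter().sum();
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Average Huffman codeword length, in bits per symbol.
/// Relies on the fact that the total cost of the code equals the sum of the
/// weights of all internal nodes created while merging.
fn huffman_average_length(counts: &[u64]) -> f64 {
    let total: u64 = counts.iter().sum();
    // Min-heap of subtree weights.
    let mut heap: BinaryHeap<Reverse<u64>> = counts.iter().map(|&c| Reverse(c)).collect();
    let mut cost = 0u64;
    while heap.len() > 1 {
        let Reverse(a) = heap.pop().unwrap();
        let Reverse(b) = heap.pop().unwrap();
        cost += a + b;
        heap.push(Reverse(a + b));
    }
    cost as f64 / total as f64
}
```

For the counts `[1, 1, 2, 4]` both functions return 1.75, since every probability is a power of two. For skewed counts such as `[5, 1, 1]` the average length is strictly larger than the entropy but still less than one bit above it. Note that neither number says anything directly about the deepest leaf.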

Ari Trachtenberg
1

According to this paper, if $p$ is in the range $0<p\leq 1/2$, and if $K$ is the unique index such that $1/F_{K+3}< p \leq 1/F_{K+2}$, where $F_K$ denotes the $K$-th Fibonacci number, then the longest Huffman codeword for a source whose least probability is $p$ is at most $K$, and no better bound is possible.
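As a worked example of reading off $K$ (my own illustration, assuming the indexing $F_1 = F_2 = 1$, which is the one that makes the bound give $K = 1$ for $1/3 < p \leq 1/2$):

$$F_1, F_2, \dots, F_7 = 1, 1, 2, 3, 5, 8, 13$$

$$p = 0.1: \quad \frac{1}{13} < 0.1 \leq \frac{1}{8} \;\Longrightarrow\; \frac{1}{F_7} < p \leq \frac{1}{F_6} \;\Longrightarrow\; K + 2 = 6,\; K = 4$$

So a source whose least probability is $0.1$ has no Huffman codeword longer than 4 bits. The probabilities $0.1, 0.1, 0.15, 0.25, 0.4$ reach that length: the merges are $0.1 + 0.1 = 0.2$, then $0.15 + 0.2 = 0.35$, then $0.25 + 0.35 = 0.6$, then $0.4 + 0.6 = 1$, which puts both $0.1$ leaves at depth 4.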

    use std::io;

    /// Returns the unique K with 1/F(K+3) < p <= 1/F(K+2), where F(1) = F(2) = 1.
    /// Returns None if p is outside (0, 1/2] or the Fibonacci numbers overflow u64.
    fn find_k(p: f64) -> Option<usize> {
        // The bound is only stated for 0 < p <= 1/2 (this check also rejects NaN).
        if !(p > 0.0 && p <= 0.5) {
            return None;
        }
        // b = F(K+2), c = F(K+3), starting at K = 1 with F(3) = 2 and F(4) = 3.
        let (mut b, mut c): (u64, u64) = (2, 3);
        let mut k = 1;
        loop {
            if 1.0 / (c as f64) < p && p <= 1.0 / (b as f64) {
                return Some(k);
            }
            // Roll the Fibonacci pair forward; give up if the next term overflows.
            let next = b.checked_add(c)?;
            b = c;
            c = next;
            k += 1;
        }
    }

    fn main() {
        let mut input = String::new();
        io::stdin().read_line(&mut input).unwrap();
        let p: f64 = input.trim().parse().unwrap();

        match find_k(p) {
            Some(k) => println!("{}", k),
            None => println!("No suitable K found."),
        }
    }


138 Aspen
0

The pathological case is when the sorted symbol frequencies resemble the Fibonacci sequence. Let $N$ be the number of symbols. For $N > 2$, the maximum possible height is $N - 1$; for $N = 1$ or $N = 2$, it is 1.
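The Fibonacci worst case can be checked numerically by tracking subtree heights while merging. This is a minimal sketch under stated assumptions, not code from the answer: `huffman_height` is a hypothetical helper, weights are positive integers, and ties are broken toward the shallower subtree, so for inputs with ties another valid Huffman tree may have a different height.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Height of one Huffman tree built from the given positive weights.
/// A single leaf (or an empty input) is reported as height 0.
fn huffman_height(weights: &[u64]) -> u32 {
    // Min-heap of (weight, height of the subtree carrying that weight).
    let mut heap: BinaryHeap<Reverse<(u64, u32)>> =
        weights.iter().map(|&w| Reverse((w, 0))).collect();
    while heap.len() > 1 {
        // Merge the two lightest subtrees; the new root sits one level
        // above the taller of its two children.
        let Reverse((w1, h1)) = heap.pop().unwrap();
        let Reverse((w2, h2)) = heap.pop().unwrap();
        heap.push(Reverse((w1 + w2, h1.max(h2) + 1)));
    }
    heap.pop().map_or(0, |Reverse((_, h))| h)
}
```

With the Fibonacci weights `[1, 1, 2, 3, 5, 8]` the function returns 5, which is $N - 1$ for $N = 6$, because every merge combines the running subtree with the next leaf. Eight equal weights give a balanced tree of height 3 instead. The sketch reports a single symbol as height 0, whereas the answer above counts that case as 1.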

Bill Liu