
The aim is to read in a file, calculate the frequency of each character, and perform a Huffman encoding, where the most common letters get short binary codes, e.g. 001, and the least common get longer ones, e.g. 01000100.

I have created a linked list containing a sorted (in ascending order) list of all characters and their respective frequencies. This is passed to the function below. In this function I aimed to add the two lowest frequencies and keep building the binary tree like that until only one tree remains. I'm unsure where to go from here. I know I have to look through the tree and see at which stage it goes left or right, then store a 0 (left) or 1 (right), but I don't know how to build a function to do this!

void traverse_list(pqueue *list)
{
    char letters[CHARACTERS] = { 0 };
    int frequencies[CHARACTERS] = { 0 };
    int j = 0, l = 0, len = 0;
    node *temp = list->head;
    tree *array[CHARACTERS];
    while (temp != NULL)
    {
        letters[j] = temp->letter;
        frequencies[j] = temp -> frequency;
        temp = temp->next;
        j++;
    }
    for (l = 0; l < CHARACTERS; l++)
    {
        if (frequencies[j])
        {
            tree* huffman = calloc(1, sizeof(tree));
            huffman -> letter = letters[l];
            huffman -> frequency = frequencies[l];
            array[len++] = huffman;
        }
    }

    while (len > 1)
    {
        tree* huffman = malloc(sizeof(tree));
        huffman -> left = array[len--];
        huffman -> right = array[len--];
        huffman -> frequency = huffman -> left -> frequency + huffman -> right -> frequency;
        array[len++] = huffman;
    } 
}

For easier reading the structs look like:

typedef struct Node
{
    char letter;
    int frequency;
    struct Node *next;

}node;

typedef struct pqueue
{
    node *head;

}pqueue;

typedef struct tree
{
    struct tree *left;
    struct tree *right;
    char letter;
    int frequency;
}tree;
Finlandia_C

2 Answers


I do not understand why you create so many arrays and then use them to create new nodes again. I think this could easily be done by modifying the Node structure, something like this:

typedef struct Node
{
    char letter;
    int frequency;
    struct Node *next;
    struct Node *left, *right;
}node;

You can then form the tree as follows.

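/* Assumes every node already in the list has left == right == NULL. */
void huffman(pqueue *list) {
    /* Nothing to combine if the list has fewer than two nodes. */
    if(list->head == NULL || list->head->next == NULL)
        return;

    while(1) {
        /* Remove the two lowest-frequency nodes from the front of the list. */
        node *left = list->head;
        list->head = list->head->next;
        node *right = list->head;
        list->head = list->head->next;

        /* Join them under a new internal node whose frequency is their sum. */
        node *huffman = malloc(sizeof(node));
        huffman->letter = '\0';
        huffman->frequency = left->frequency + right->frequency;
        huffman->left = left;
        huffman->right = right;
        huffman->next = NULL;

        /* The list is now empty: the new node is the root of the finished tree. */
        if(list->head == NULL) {
            list->head = huffman;
            break;
        }
        insertHuffman(list, huffman);
    }
}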

where insertHuffman() just inserts the new node into the list in sorted order. At the end only one node is left in the list, the root of the tree, and you can then do a simple traversal to decide the code for each leaf. You can definitely choose a better condition than while(1), which I used! :P I used it because that was the first thing that came to mind. And I believe you would definitely be able to write insertHuffman() yourself.
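
A minimal sketch of what insertHuffman() might look like, assuming the modified Node struct above and the pqueue struct from the question. The tie-breaking rule (a new node goes after existing nodes of equal frequency) is an assumption, not something this answer specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Node layout from the answer above: list link plus tree children. */
typedef struct Node {
    char letter;
    int frequency;
    struct Node *next;
    struct Node *left, *right;
} node;

typedef struct pqueue {
    node *head;
} pqueue;

/* Insert n so the list stays sorted by ascending frequency.
   Ties go after existing nodes of equal frequency (an arbitrary choice). */
void insertHuffman(pqueue *list, node *n)
{
    assert(list != NULL && n != NULL);

    /* link always points at the pointer that should end up pointing to n. */
    node **link = &list->head;
    while (*link != NULL && (*link)->frequency <= n->frequency)
        link = &(*link)->next;

    n->next = *link;
    *link = n;
}
```

Walking a pointer-to-pointer avoids a separate case for inserting at the head of the list.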

EDIT::

/* Needs <stdio.h> for printf. */
void printHuffman(node *head, char *a, int len) {
    if(head->left == NULL && head->right == NULL) {
        /* Leaf: a[0..len-1] already holds the full code, so terminate and print it. */
        a[len] = '\0';
        printf("%c %s\n", head->letter, a);
    } else {
        /* Internal node: append 0 for the left branch, 1 for the right branch. */
        a[len] = '0';
        printHuffman(head->left, a, len + 1);
        a[len] = '1';
        printHuffman(head->right, a, len + 1);
    }
}

I think this will print the values of the Huffman of every character.

Here, a is a character array of size CHARACTERS with all values initialized to \0, and len is the length of the code built so far (0 on the first call).
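
The same walk could be applied to the tree struct from the question, storing each code in a table instead of printing it. This is only a sketch: assign_codes, MAX_CODE_LEN and the codes table are hypothetical names that do not appear in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CODE_LEN 256   /* assumed upper bound on code length, including '\0' */

/* tree struct as given in the question */
typedef struct tree {
    struct tree *left;
    struct tree *right;
    char letter;
    int frequency;
} tree;

/* Walk the tree. path holds the '0'/'1' choices made so far (depth of them).
   codes must have 256 rows, one per possible unsigned char value. */
void assign_codes(const tree *t, char *path, int depth,
                  char codes[][MAX_CODE_LEN])
{
    if (t == NULL)
        return;
    assert(depth < MAX_CODE_LEN);

    if (t->left == NULL && t->right == NULL) {
        /* Leaf: terminate the path and copy it into this letter's row. */
        path[depth] = '\0';
        strcpy(codes[(unsigned char)t->letter], path);
        return;
    }
    path[depth] = '0';                        /* going left appends a 0 */
    assign_codes(t->left, path, depth + 1, codes);
    path[depth] = '1';                        /* going right appends a 1 */
    assign_codes(t->right, path, depth + 1, codes);
}
```

It would be called as assign_codes(root, path, 0, codes), with path a scratch buffer of MAX_CODE_LEN chars; afterwards codes['e'] would hold the bit string for 'e'. A tree consisting of a single leaf yields an empty code, which a caller would need to special-case.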

EDIT 2 ::

I have looked at how you combine tree nodes: you take the last two nodes from an array sorted in ascending order and join them into a new node, which is put at the end of the array. As far as I know, Huffman coding does not combine the elements with the highest frequencies; it combines the two elements with the lowest frequencies, and the resulting tree is then used to find the Huffman codes.
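
One way to combine the lowest frequencies while keeping the array from the question is sketched below. build_tree and take_min are hypothetical helpers, not part of the question's code; the linear scan for the minimum is chosen for brevity and does not rely on the array being sorted.

```c
#include <assert.h>
#include <stdlib.h>

/* tree struct as given in the question */
typedef struct tree {
    struct tree *left;
    struct tree *right;
    char letter;
    int frequency;
} tree;

/* Remove and return the lowest-frequency entry of array[0..len-1].
   The hole is filled with the last entry; the caller shrinks len by one. */
static tree *take_min(tree **array, int len)
{
    int min = 0;
    for (int i = 1; i < len; i++)
        if (array[i]->frequency < array[min]->frequency)
            min = i;
    tree *t = array[min];
    array[min] = array[len - 1];
    return t;
}

/* Combine the two lowest-frequency trees until one root remains.
   Returns NULL if len is 0 or an allocation fails. */
tree *build_tree(tree **array, int len)
{
    assert(array != NULL);
    if (len <= 0)
        return NULL;

    while (len > 1) {
        tree *parent = calloc(1, sizeof *parent);
        if (parent == NULL)
            return NULL;

        parent->left = take_min(array, len);
        len--;
        parent->right = take_min(array, len);
        len--;
        parent->frequency = parent->left->frequency + parent->right->frequency;

        array[len++] = parent;   /* put the combined tree back */
    }
    return array[0];
}
```

Each pass costs a linear scan, so this is quadratic in the number of distinct characters; with at most 256 of them that is unlikely to matter.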

user007
  • I appreciate this, but the rest of the code builds on what I've done previously; I can't change it now, it would take too long. The problem is I don't know how to traverse the tree to decide the values :( – Finlandia_C Dec 07 '15 at 16:50
  • @Finlandia_C Adding two pointers to the struct shouldn't affect much of the other code, I think. You can leave them unused in the other parts of the code, or you can just copy the `plist` you receive in the function and then work on the copied list. I will update the answer about traversing the tree. – user007 Dec 07 '15 at 16:53
  • Thanks, if you could do the traversal for the way I've done it, then I will know both ways! – Finlandia_C Dec 07 '15 at 16:54
  • @Finlandia_C This doesn't go with what you are trying to do, but I think this shall work. I will see what you are trying to do with your code! – user007 Dec 07 '15 at 17:08
  • @Finlandia_C Check the answer, and share a link to the Huffman algorithm you are using to make the Huffman codes. – user007 Dec 07 '15 at 18:05

Try changing

        huffman -> left = array[len--];
        huffman -> right = array[len--];

to

        huffman -> left = array[--len];
        huffman -> right = array[--len];

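in order to get the last element of the array. array[len] is one past the last stored element, so len has to be decremented before it is used as an index.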
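
The difference matters because len counts the stored elements, so the last one lives at index len - 1. A small illustration, with pop_last as a hypothetical helper that is not part of the question's code:

```c
#include <assert.h>

/* Pop the last element of a stack stored in items[0..*len-1]. */
int pop_last(const int *items, int *len)
{
    assert(*len > 0);
    /* Pre-decrement: *len becomes the index of the last element, then we read it.
       items[(*len)--] would instead read one past the last element. */
    return items[--(*len)];
}
```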

MikeCAT
  • Thanks for that, but maybe I didn't word it correctly. I'm stuck on how to actually write the function which looks through the tree to see where it goes left/right, and then how to assign a binary code to each letter. – Finlandia_C Dec 07 '15 at 15:18