Page 255 - Programming Microcontrollers in C
P. 255
240 Chapter 5 Programming Large 8-Bit Systems
code is shown in Table 5-1. Frequently occurring letters like e, i, or a
are encoded with only two or three bits. On the other hand, letters that
occur very infrequently—like c or v—require 9 bits to encode. Even
though some letters require more than 8 bits to encode, the average
number of bits per character for this particular message set is only 4.032.
Decoding a Huffman code with a program is relatively easy. The
program must contain the Huffman tree with all of its nodes and leaves.
If we want the entire alphabet and a space, a period, a comma and an
end of message, there will be 59 entries in the tree. This or an equivalent
tree is an overhead that must be carried for every Huffman program.
Assume that the statistical analysis of the messages along with the
construction of the Huffman tree is complete. The approach taken to
build this tree in memory is to allow one byte for each node. The tree
is searched from the root node, which is the lowest address. The message
to be decoded is examined. If the bit in the message is a zero, the node
pointer is incremented by one and the next bit is examined. If the bit in
the message is a one, the node pointer is incremented by the content of
the node. All printable characters are entered into the tree as negative
numbers. The contents of the node are examined. If the content of the
node is positive, the next bit from the message is examined and so
forth. Whenever the content of a node is negative, its most significant
bit is removed, and the character is sent to the output. At that time, the
node pointer is returned to the root node, and the sequence is repeated
until an end of message is found.
Character Code Character Code
E 00 D 110000
I 01 ‘ ‘ 110001
A 100 G 111100
‘\n’ 101 U 111101
N 11001 L 1111100
R 11010 O 1111101
T 11011 F 11111100
M 11100 H 11111101
S 11101 P 11111110
C 111111110
V 111111111
Table 5-1: Huffman Code