Page 128 - Digital Analysis of Remotely Sensed Imagery
P. 128
Storage of Remotely Sensed Data 99
3.4.3 LZW Coding
Initially devised by Ziv and Lempel (1977), and later supplemented
by Welch (1984), the LZW compression is an error-free compression
algorithm in which fixed-length codes are used to replace patterns
of variable-length sequences of pixel values in the original data.
Conceptually, this general purpose compression method is simple
and versatile. It does not require any a priori knowledge on the
occurrence probability of pixel values to be encoded. Instead, a code
table of 4096 entries is constructed. All encoded codes are 12 bits
long, each corresponding to one of the entries in the table. Codes 0
to 255 in the code table always represent single bytes from the input
image. Codes 256 through 4095 are represented as sequences of
bytes. For example, code 439 may represent the sequence of three
bytes. Each time the compression algorithm encounters this
sequence in the input file, code 439 is placed in the encoded file.
During uncompression code 439 is translated back to the true 3-byte
sequence via the code table. The longer the sequence assigned to a
single code, and the more often the sequence is repeated, the higher
the compression ratio.
During compression the computer reads the data from the input
stream and builds a code or translation table with the patterns as it
encounters them. Initially, there are only 256 entries in this code table,
the remainder of the table being blank. Thus, the first codes going
into the compressed file are single bytes from the input file being
converted to 12 bits. As the encoding proceeds, a new pattern is added
to the code table whenever it is encountered for the first time. At the
same time, its index is added to the output stream. When this pattern
is encountered since the last code-table refresh, its index from the
code table is put on the output stream, thus achieving data
compression. When the number of patterns detected by the compressor
in the input stream exceeds the number of patterns encodable with
the current number of bits, the number of bits per LZW code is
increased by one.
The compression process is made up of four steps:
• Definition of the number of bits needed to represent the actual
data. The first byte of the compressed data stream is a value
indicating the minimum number of bits required to represent
the set of actual pixel values. Normally, this will be the same
as the number of color bits.
• Compression of image pixels to a series of compression codes.
• Conversion of these compression codes into a string of 8-bit
bytes.
• Package sets of bytes into blocks preceded by character counts
and output. Each new pattern is entered into the code table
and its index is used to replace it in the compressed stream.