Page 241 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience
218 Chapter 9. Error-Resilience Video Coding Techniques
layer contains a coarsely quantized version of video, whereas the enhancement
layers carry the error between the original version and this coarsely quantized
version. This is known as SNR scalability. Another form is spatial scalability.
This is very similar to SNR scalability. The only difference is that pictures in
the base layer are subsampled to a smaller size. Yet another form of layered
coding is known as data partitioning. In this case, the base layer contains vital
video information like headers, motion vectors, and low-frequency DCT coefficients.
Other information, like high-frequency DCT coefficients, is included
in the enhancement layers.
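
As a rough illustration of SNR scalability, the following sketch quantizes a list of sample values coarsely for the base layer and places the quantized residual in a single enhancement layer. The function names, the uniform quantizer, and the step sizes are assumptions made for this example only; an actual codec applies the idea to DCT coefficients rather than raw values.

```python
def quantize(values, step):
    """Uniform quantizer: map each value to the nearest multiple of step."""
    return [step * round(v / step) for v in values]


def encode_snr_layers(values, coarse_step, fine_step):
    """Return (base, enhancement) layers for a list of sample values."""
    # Base layer: coarsely quantized version of the input.
    base = quantize(values, coarse_step)
    # Enhancement layer: the error left by the base layer, quantized more finely.
    residual = [v - b for v, b in zip(values, base)]
    enhancement = quantize(residual, fine_step)
    return base, enhancement


def decode_snr_layers(base, enhancement=None):
    """Decode with the base layer alone, or refine it with the enhancement layer."""
    if enhancement is None:
        return list(base)
    return [b + e for b, e in zip(base, enhancement)]


if __name__ == "__main__":
    samples = [3.0, 17.0, -9.0, 42.0]
    base, enh = encode_snr_layers(samples, coarse_step=16.0, fine_step=2.0)
    print("base only:", decode_snr_layers(base))
    print("base + enhancement:", decode_snr_layers(base, enh))
```

Decoding the base layer alone bounds the error by half the coarse step, while adding the enhancement layer reduces the bound to half the fine step.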
Note that all these forms of layered coding are supported in recent standardization
efforts. For example, MPEG-4 supports temporal and spatial scalability
in addition to data partitioning. H.263+ supports temporal, SNR, and spatial
scalability in annex O, and H.263++ supports data partitioning in annex V.
9.6.5 Multiple Description Coding
This technique assumes that there are multiple channels between the encoder
and the decoder. These multiple channels can be physically distinct paths or
they can be a single path divided into multiple virtual channels using, for
example, time or frequency division. The technique further assumes that the
error events of these multiple channels are independent. This means that the
probability that all channels simultaneously experience errors is very small.
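
The independence argument can be checked with a short calculation: for independent channels, the probability that every channel is in error at the same time is the product of the individual error probabilities. The function name and the example probabilities below are illustrative assumptions, not values from the text.

```python
def prob_all_channels_fail(error_probs):
    """Probability that all channels experience errors simultaneously,
    assuming the error events are independent."""
    p = 1.0
    for e in error_probs:
        p *= e
    return p


if __name__ == "__main__":
    # Two channels, each with an assumed 10% error probability.
    print(prob_all_channels_fail([0.1, 0.1]))
```

With two channels that each fail 10% of the time, both fail together only about 1% of the time, and the figure shrinks further as channels are added.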
Similar to layered coding, multiple description coding encodes video into
multiple streams known as descriptions. In this case, however, the descriptions
are correlated and have equal importance. The requirement that all descriptions
have equal importance means that the descriptions must share some fundamen-
tal information about the input video. As a consequence of this information
sharing, the descriptions are correlated.
At the encoder, each description is transmitted on a different channel. As
already mentioned, the error events of the channels are independent. As a
result, it is very likely that at least one description will be received at the decoder without errors.
This description carries some fundamental information about the transmitted
video and can, therefore, be used to provide a basic level of quality. Since
the descriptions are correlated, missing descriptions can be estimated from
correctly received descriptions and the quality can be improved.
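
To make the estimation step concrete, here is a minimal sketch of one simple decomposition, chosen only for illustration and not taken from the text: a 1-D signal is split into even-indexed and odd-indexed samples, and a lost description is estimated by averaging the neighbouring samples of the received one. The function names and the `length` parameter are hypothetical.

```python
def split_descriptions(samples):
    """Split a signal into two correlated descriptions (even and odd samples)."""
    return samples[0::2], samples[1::2]


def reconstruct(length, even=None, odd=None):
    """Rebuild a signal of the given length from one or both descriptions."""
    if even is None and odd is None:
        raise ValueError("at least one description is required")
    out = [None] * length
    if even is not None:
        out[0::2] = even
    if odd is not None:
        out[1::2] = odd
    # Estimate each missing sample from its received neighbours.
    for i in range(length):
        if out[i] is None:
            neighbours = [out[j] for j in (i - 1, i + 1)
                          if 0 <= j < length and out[j] is not None]
            if not neighbours:
                raise ValueError("no received neighbour to estimate from")
            out[i] = sum(neighbours) / len(neighbours)
    return out


if __name__ == "__main__":
    signal = [10, 12, 14, 16, 18, 20]
    even, odd = split_descriptions(signal)
    print(reconstruct(len(signal), even, odd))   # both descriptions received
    print(reconstruct(len(signal), even=even))   # odd description lost
```

When both descriptions arrive, the signal is recovered exactly; when one is lost, the smoothness of the signal determines how close the interpolated estimate is.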
There are a number of methods to achieve the required decomposition into
descriptions. For example, in Ref. 188, the input signal is decomposed and
encoded into two streams. The two streams are obtained by transmitting two
quantization indices for each quantized level. The index assignment is designed
such that when both indices are received, the reconstruction quality
is equivalent to that of a fine quantizer. When, however, only one index is
received, the reconstruction quality is equivalent to that of a coarse quantizer.
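
The behaviour described above can be sketched with a very simple index assignment. This is an assumed, illustrative assignment (two staggered coarse quantizers), not necessarily the one designed in Ref. 188: a fine cell index k is mapped to the pair (k // 2, (k + 1) // 2), so that the sum of the two indices recovers k, while either index alone narrows k down to two adjacent fine cells. All function names are hypothetical.

```python
import math


def central_index(x, step):
    """Fine (central) quantizer: index k of the cell [k*step, (k+1)*step)."""
    return math.floor(x / step)


def assign_indices(k):
    """Map the fine index k to two indices (i, j) with i + j == k."""
    return k // 2, (k + 1) // 2


def decode(step, i=None, j=None):
    """Reconstruct from both indices (fine) or from one index (coarse)."""
    if i is not None and j is not None:
        k = i + j
        return (k + 0.5) * step      # midpoint of the fine cell
    if i is not None:
        return (2 * i + 1) * step    # k is 2i or 2i+1: midpoint of both cells
    if j is not None:
        return 2 * j * step          # k is 2j-1 or 2j: midpoint of both cells
    raise ValueError("at least one index is required")


if __name__ == "__main__":
    step = 1.0
    x = 3.7
    i, j = assign_indices(central_index(x, step))
    print("both indices:", decode(step, i, j))
    print("first index only:", decode(step, i=i))
    print("second index only:", decode(step, j=j))
```

With both indices the reconstruction error is at most half a fine step; with a single index the uncertainty covers two fine cells, so the error bound doubles to one full step, which matches the coarse-quantizer behaviour described in the text.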