            layer contains a coarsely quantized version of video, whereas the enhancement
            layers carry the error between the original version and this coarsely quantized
            version. This is known as SNR scalability. Another form is spatial scalability.
            This is very similar to SNR scalability. The only difference is that pictures in
            the base layer are subsampled to a smaller size. Yet another form of layered
            coding is known as data partitioning. In this case, the base layer contains vital
            video information like headers, motion vectors, and low-frequency DCT
            coefficients. Other information, like high-frequency DCT coefficients, is
            included in the enhancement layers.
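               The idea behind SNR scalability can be illustrated with a minimal numerical
            sketch. The function names, step sizes, and the one-dimensional toy block below
            are purely illustrative and not taken from any standard; the point is only that
            the base layer alone yields a coarse reconstruction, while the enhancement layer
            carries the residual needed to refine it.

            import numpy as np

            def encode_snr_layers(dct_block, base_step=16, enh_step=4):
                # Base layer: coarsely quantized DCT coefficients.
                base = np.round(dct_block / base_step).astype(int)
                # Enhancement layer: the error between the original coefficients
                # and the coarse reconstruction, quantized with a finer step.
                residual = dct_block - base * base_step
                enh = np.round(residual / enh_step).astype(int)
                return base, enh

            def decode_snr_layers(base, enh=None, base_step=16, enh_step=4):
                # The base layer alone gives a coarse reconstruction; adding the
                # enhancement layer refines it.
                recon = base.astype(float) * base_step
                if enh is not None:
                    recon += enh * enh_step
                return recon

            block = np.array([310.0, -47.0, 22.0, -9.0, 5.0, -3.0, 1.0, 0.0])
            b, e = encode_snr_layers(block)
            coarse = decode_snr_layers(b)        # base layer only
            refined = decode_snr_layers(b, e)    # base + enhancement

            A spatially scalable coder follows the same pattern, except that the base-layer
            pictures are first subsampled to a smaller size.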
               Note that all these forms of layered coding are supported in recent
            standardization efforts. For example, MPEG-4 supports temporal and spatial
            scalability in addition to data partitioning. H.263+ supports temporal, SNR,
            and spatial scalability in Annex O, and H.263++ supports data partitioning in
            Annex V.

            9.6.5  Multiple Description Coding

            This  technique  assumes  that  there  are  multiple  channels  between  the  encoder
            and  the  decoder.  These  multiple  channels  can  be  physically  distinct  paths  or
            they  can  be  a  single  path  divided  into  multiple  virtual  channels  using,  for
            example,  time  or  frequency  division.  The  technique  further  assumes  that  the
            error  events  of  these  multiple  channels  are  independent.  This  means  that  the
            probability that all channels  simultaneously experience  errors is very small.
               Similar  to  layered  coding,  multiple  description  coding  encodes  video  into
            multiple streams known as descriptions. In this case, however, the descriptions
            are correlated and have equal importance. The requirement that all descriptions
            have equal importance means that the descriptions must share some fundamen-
            tal  information  about  the  input  video.  As  a  consequence  of  this  information
            sharing, the descriptions  are  correlated.
               At the encoder, each description is transmitted over a different channel. As
            already mentioned, the error events of the channels are independent. As a
            result, with high probability at least one description is received at the decoder
            without errors.
            This  description  carries  some  fundamental  information  about  the  transmitted
            video  and  can,  therefore,  be  used  to  provide  a  basic  level  of  quality.  Since
            the  descriptions  are  correlated,  missing  descriptions  can  be  estimated  from
            correctly received  descriptions  and the quality can be improved.
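               Before turning to specific designs, the interplay between equal importance,
            correlation, and estimation of lost descriptions can be illustrated with a
            deliberately simple, hypothetical decomposition: split the samples of a signal
            into an even-indexed and an odd-indexed description. Either description alone
            gives a usable, lower-resolution signal, and a missing description can be
            estimated by interpolating the one that arrived. This sketch is only
            illustrative and is not one of the schemes discussed here.

            import numpy as np

            def mdc_split(samples):
                # Two equally important, correlated descriptions:
                # the even-indexed and the odd-indexed samples.
                return samples[0::2], samples[1::2]

            def mdc_decode(even=None, odd=None):
                if even is not None and odd is not None:
                    out = np.empty(len(even) + len(odd))
                    out[0::2], out[1::2] = even, odd       # full quality
                    return out
                if odd is None:
                    # Odd description lost: interpolate it from the even samples
                    # (the wrap-around at the end is a crude simplification).
                    est = (even + np.roll(even, -1)) / 2.0
                    out = np.empty(2 * len(even))
                    out[0::2], out[1::2] = even, est
                else:
                    # Even description lost: interpolate from the odd samples.
                    est = (odd + np.roll(odd, 1)) / 2.0
                    out = np.empty(2 * len(odd))
                    out[1::2], out[0::2] = odd, est
                return out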
               There are a number of methods to achieve the required decomposition into
            descriptions.  For  example,  in  Ref.  188,  the  input  signal  is  decomposed  and
            encoded  into  two  streams.  The  two  streams  are  obtained  by  transmitting  two
            quantization indices for each quantized level. The index assignment is
            designed such that when both indices are received, the reconstruction quality
            is equivalent to that of a fine quantizer. When, however, only one index is
            received, the reconstruction quality is equivalent to that of a coarse quantizer.
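               A toy index-assignment table makes this behavior concrete. The table, step
            size, and function names below are invented for illustration and do not
            reproduce the exact design of Ref. 188, but they show the qualitative effect:
            the pair of indices identifies the fine quantizer level exactly, whereas a
            single index only narrows it down to a small set of levels, whose average acts
            as a coarse reconstruction.

            FINE_STEP = 1.0        # step size of the fine quantizer (illustrative)
            NUM_LEVELS = 8         # fine levels 0 .. 7

            def assign_indices(level):
                # Even levels map to (i, i), odd levels to (i, i + 1): the pairs
                # lie on the two central diagonals of the index matrix.
                i = level // 2
                j = i + (level % 2)
                return i, j

            PAIR_TO_LEVEL = {assign_indices(k): k for k in range(NUM_LEVELS)}

            def decode(i=None, j=None):
                # Both indices: the fine level is known exactly.
                # One index: average the fine levels consistent with it.
                if i is not None and j is not None:
                    candidates = [PAIR_TO_LEVEL[(i, j)]]
                elif i is not None:
                    candidates = [k for (a, b), k in PAIR_TO_LEVEL.items() if a == i]
                else:
                    candidates = [k for (a, b), k in PAIR_TO_LEVEL.items() if b == j]
                return FINE_STEP * sum(candidates) / len(candidates)

            i, j = assign_indices(5)          # fine level 5 is sent as the pair (2, 3)
            both = decode(i, j)               # 5.0 -- fine-quantizer quality
            only_first = decode(i=i)          # 4.5 -- coarse estimate from channel 1
            only_second = decode(j=j)         # 5.5 -- coarse estimate from channel 2

            In this particular assignment each index is shared by at most two fine levels,
            so losing one description roughly doubles the effective quantization step,
            which is exactly the fine-versus-coarse trade-off described above.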