Page 97 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience

74                                      Chapter 3.  Video Coding:  Standards


            reconstruct the output video. As shown, at various points of this
            encoding-decoding process, users are allowed to interact with (access
            and/or manipulate) the individual VOs.
               As an example, consider a sequence showing a hot-air balloon flying in
            the sky. In this case, the sequence can be represented using two VOs: the
            balloon and the sky background. Figure 3.9(a) shows a single frame of this
            sequence. At this particular instant of time, the two VOs are represented by
            the two VOPs shown in Figures 3.9(b) and 3.9(c). At the encoder, each VOP
            is encoded individually and the two bitstreams are multiplexed. At the decoder,
            the received bitstream is demultiplexed into the two individual bitstreams. Each
            bitstream is then decoded to reconstruct the corresponding VOP. The two
            VOPs are then put together to reconstruct the transmitted frame. The user can
            optionally manipulate the decoded VOPs. For example, in Figure 3.9(d) the
            balloon VOP has been enlarged, rotated, and translated as compared to the
            original frame.
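
               The flow just described (encode each VOP individually, multiplex,
            demultiplex, decode, compose, and optionally manipulate) can be illustrated
            with a minimal Python sketch. Everything in it is an assumption made for
            illustration: the VOP class, the function names, the packet layout (1-byte
            object id plus 4-byte length), and the use of zlib as a stand-in for real
            shape and texture coding are hypothetical and do not reflect the MPEG-4
            bitstream syntax. Only translation is shown as a manipulation.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Dict


@dataclass
class VOP:
    """Toy video object plane: luminance samples plus a binary shape mask."""
    width: int
    height: int
    pixels: bytes  # width*height samples, row-major
    mask: bytes    # width*height entries, 1 = sample belongs to the object


def encode_vop(vop: VOP) -> bytes:
    # Stand-in for real coding: 4-byte size header + zlib-compressed payload.
    header = struct.pack(">HH", vop.width, vop.height)
    return header + zlib.compress(vop.pixels + vop.mask)


def decode_vop(data: bytes) -> VOP:
    width, height = struct.unpack(">HH", data[:4])
    payload = zlib.decompress(data[4:])
    n = width * height
    return VOP(width, height, payload[:n], payload[n:2 * n])


def multiplex(streams: Dict[int, bytes]) -> bytes:
    # Hypothetical packet layout: object id (1 byte), length (4 bytes), data.
    return b"".join(
        struct.pack(">BI", vo_id, len(s)) + s for vo_id, s in streams.items()
    )


def demultiplex(muxed: bytes) -> Dict[int, bytes]:
    streams = {}
    pos = 0
    while pos < len(muxed):
        vo_id, length = struct.unpack_from(">BI", muxed, pos)
        pos += 5  # size of the ">BI" packet header
        streams[vo_id] = muxed[pos:pos + length]
        pos += length
    return streams


def compose(background: VOP, obj: VOP, dx: int = 0, dy: int = 0) -> bytes:
    # Paste the object onto the background wherever its mask is set,
    # translated by (dx, dy) to mimic a simple user manipulation.
    frame = bytearray(background.pixels)
    for y in range(obj.height):
        for x in range(obj.width):
            if obj.mask[y * obj.width + x]:
                tx, ty = x + dx, y + dy
                if 0 <= tx < background.width and 0 <= ty < background.height:
                    frame[ty * background.width + tx] = obj.pixels[y * obj.width + x]
    return bytes(frame)


# A 4x4 "sky" VOP and a 2x2 "balloon" occupying samples 5, 6, 9, 10.
sky = VOP(4, 4, bytes([200] * 16), bytes([1] * 16))
inside = {5, 6, 9, 10}
balloon = VOP(
    4, 4,
    bytes(50 if i in inside else 0 for i in range(16)),
    bytes(1 if i in inside else 0 for i in range(16)),
)

# Encoder side: code each VOP individually, then multiplex.
muxed = multiplex({0: encode_vop(sky), 1: encode_vop(balloon)})

# Decoder side: demultiplex, decode each bitstream, then compose.
decoded = {vo_id: decode_vop(s) for vo_id, s in demultiplex(muxed).items()}
frame = compose(decoded[0], decoded[1])        # reconstructed frame
moved = compose(decoded[0], decoded[1], dx=1)  # balloon shifted one sample right
```

               Because each VO travels in its own bitstream, the decoder in this
            sketch can reposition the balloon without touching the sky data, which is
            the kind of object-level interaction the example above describes.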
               In  addition  to  composition  information  (which  indicates  where  and  when
            the  VOP  is  to  be  displayed),  each  VOP  is  encoded  in  terms  of  its  shape,

            Figure 3.9: Object-based representation, coding, and interaction.
            (a) Balloon in sky (original); (b) Sky (background) VOP;
            (c) Balloon VOP; (d) Decoded and manipulated.