Page 97 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience
74 Chapter 3. Video Coding: Standards
reconstruct the output video. As shown, at various points of this
encoding-decoding process, users are allowed to interact with (access and/or
manipulate) the individual VOs.
As an example, consider a sequence showing a hot-air balloon flying in
the sky. In this case, the sequence can be represented using two VOs: the
balloon and the sky background. Figure 3.9(a) shows a single frame of this
sequence. At this particular instant of time, the two VOs are represented by
the two VOPs shown in Figures 3.9(b) and 3.9(c). At the encoder, each VOP
is encoded individually and the two bitstreams are multiplexed. At the decoder,
the received bitstream is demultiplexed into the two individual bitstreams. Each
bitstream is then decoded to reconstruct the corresponding VOP. The two
VOPs are then put together to reconstruct the transmitted frame. The user can
optionally manipulate the decoded VOPs. For example, in Figure 3.9(d) the
balloon VOP has been enlarged, rotated, and translated as compared to the
original frame.
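
The balloon example can be illustrated with a small toy sketch. It is not the MPEG-4 bitstream syntax: zlib stands in for the actual shape and texture coders, the length-prefixed container is a hypothetical multiplex format, and all names (encode_vop, multiplex, compose, and so on) are assumptions made for illustration. Each VOP is modelled as a texture plus a binary shape mask on a 4x4 frame, and only translation is shown as a manipulation.

```python
import struct
import zlib

W, H = 4, 4  # toy frame size (assumption)


def encode_vop(vop_id, texture, mask):
    """Encode one VOP (binary shape mask + texture) into its own bitstream.
    zlib is only a placeholder for the real shape/texture coding tools."""
    header = struct.pack(">BHH", vop_id, W, H)
    return header + zlib.compress(bytes(mask) + bytes(texture))


def decode_vop(bitstream):
    """Inverse of encode_vop: returns (vop_id, texture, mask)."""
    vop_id, w, h = struct.unpack(">BHH", bitstream[:5])
    payload = zlib.decompress(bitstream[5:])
    n = w * h
    return vop_id, list(payload[n:]), list(payload[:n])


def multiplex(bitstreams):
    """Concatenate bitstreams, each prefixed by its 4-byte length."""
    return b"".join(struct.pack(">I", len(b)) + b for b in bitstreams)


def demultiplex(stream):
    """Split the multiplexed stream back into the individual bitstreams."""
    bitstreams, pos = [], 0
    while pos < len(stream):
        (length,) = struct.unpack(">I", stream[pos:pos + 4])
        pos += 4
        bitstreams.append(stream[pos:pos + length])
        pos += length
    return bitstreams


def compose(background, texture, mask, dx=0, dy=0):
    """Paste the masked foreground onto the background, shifted by (dx, dy)."""
    frame = list(background)
    for y in range(H):
        for x in range(W):
            tx, ty = x + dx, y + dy
            if mask[y * W + x] and 0 <= tx < W and 0 <= ty < H:
                frame[ty * W + tx] = texture[y * W + x]
    return frame


# Two VOPs at one time instant: sky background and balloon.
sky_tex, sky_mask = [10] * (W * H), [1] * (W * H)
balloon_mask = [1 if i in (5, 6, 9, 10) else 0 for i in range(W * H)]
balloon_tex = [200 if m else 0 for m in balloon_mask]
original = compose(sky_tex, balloon_tex, balloon_mask)

# Encoder: encode each VOP individually, then multiplex.
stream = multiplex([
    encode_vop(0, sky_tex, sky_mask),
    encode_vop(1, balloon_tex, balloon_mask),
])

# Decoder: demultiplex, decode each VOP, then compose the frame.
decoded = {vid: (tex, mask) for vid, tex, mask in map(decode_vop, demultiplex(stream))}
reconstructed = compose(decoded[0][0], *decoded[1])

# Optional user interaction: translate the balloon VOP one pixel to the right.
moved = compose(decoded[0][0], *decoded[1], dx=1)
```

Because each VOP travels in its own bitstream, the manipulation touches only the balloon's decoded data; the sky VOP is reused unchanged.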
In addition to composition information (which indicates where and when
the VOP is to be displayed), each VOP is encoded in terms of its shape,
Figure 3.9: Object-based representation, coding, and interaction. (a) Balloon in sky (original); (b) sky (background) VOP; (c) balloon VOP; (d) decoded and manipulated.