Page 165 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience
142 Chapter 6. Multiple-Reference Motion Estimation Techniques
6.2 Multiple-Reference Motion Estimation: A Review
In multiple-reference motion-compensated prediction (MR-MCP), motion
estimation and compensation are extended to utilize more than one reference
frame. The reference frames are assembled in a multi-frame memory (or
buffer) that is maintained simultaneously at encoder and decoder. In this case,
in addition to the spatial displacements (d_x, d_y), a motion vector is extended
to also include a temporal displacement d_t. This is the index into the
multi-frame memory. The process of MR-MCP is illustrated in Figure 6.1.
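
To make the extended motion vector concrete, the following is a minimal sketch of a full-search block matcher that returns (d_x, d_y, d_t). It is an illustration under stated assumptions, not the scheme of any particular codec: the function names `sad` and `mr_motion_search`, the SAD matching criterion, the block size, the search range, and the convention that d_t = 0 denotes the most recent decoded frame are all choices made here for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def mr_motion_search(cur, frame_memory, bx, by, n=4, search=2):
    """Full search over every frame held in the multi-frame memory.

    Returns (best_sad, (dx, dy, dt)), where dt is the index into
    frame_memory (assumed here: dt = 0 is the most recent decoded frame)."""
    height, width = len(cur), len(cur[0])
    best = None
    for dt, ref in enumerate(frame_memory):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                # Skip candidates that fall outside the reference frame.
                if not (0 <= bx + dx and bx + dx + n <= width
                        and 0 <= by + dy and by + dy + n <= height):
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy, dt))
    return best


# Toy usage: the newest reference (dt = 0) is blank, while the older
# reference (dt = 1) contains the content of the current frame, displaced.
old = [[10 * y + x for x in range(8)] for y in range(8)]
blank = [[0] * 8 for _ in range(8)]
cur = [[10 * (y + 1) + (x - 1) for x in range(8)] for y in range(8)]
best_cost, (dx, dy, dt) = mr_motion_search(cur, [blank, old], bx=2, by=2)
```

In this toy case the search selects the older frame (d_t = 1) with a perfect match, which is exactly the situation in which a single-reference search would have had no good candidate.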
The main aim of MR-MCP is to improve coding efficiency. Thus, the reference
generation block in Figure 6.1(a) can utilize any technique that provides
useful data for motion-compensated prediction. Examples of such techniques
are reviewed in what follows.
A number of MR-MCP techniques have been proposed for inclusion within
MPEG-4. Examples are global motion compensation (GMC) [131, 132],
dynamic sprites (DS) [132], and short-term frame memory/long-term frame
memory (STFM/LTFM) prediction [133]. In these techniques, MCP is
performed using two reference frames. The first reference frame is always the
past decoded frame, whereas the second reference frame is generated using
different methods. In GMC, the past decoded frame is warped to provide the
second reference frame. The technique of DS is a more general case of GMC.
In DS, past decoded frames are warped and blended into a sprite memory. This
sprite memory is used to provide the second reference frame. In STFM/LTFM,
two frame memories are used. The STFM is used to store the past decoded
frame, whereas the LTFM is used to store an earlier decoded frame. The
LTFM is updated using a refresh rule based on scene-change detection. Both
DS and STFM/LTFM can benefit from another MR-MCP technique, which is
background memory prediction [134].
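
The two-memory arrangement of STFM/LTFM can be sketched as follows. The text above states only that the LTFM is refreshed by a rule based on scene-change detection; the details here are assumptions made for illustration. In particular, the class name `TwoFrameMemory`, the helper `mean_abs_diff`, the use of a mean-absolute-difference threshold as the scene-change detector, the threshold value, and the choice to copy the outgoing STFM content into the LTFM on a detected change are all hypothetical and are not taken from [133].

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(abs(p - q) for row_a, row_b in zip(a, b)
                for p, q in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


class TwoFrameMemory:
    """Hypothetical STFM/LTFM pair with a scene-change-driven refresh rule."""

    def __init__(self, first_frame, threshold=30.0):
        self.stfm = first_frame     # short-term memory: past decoded frame
        self.ltfm = first_frame     # long-term memory: an earlier decoded frame
        self.threshold = threshold  # assumed scene-change threshold

    def update(self, decoded):
        """Store a newly decoded frame; return True if a scene change was flagged."""
        scene_change = mean_abs_diff(decoded, self.stfm) > self.threshold
        if scene_change:
            # Assumed rule: keep the last frame of the outgoing scene.
            self.ltfm = self.stfm
        self.stfm = decoded
        return scene_change

    def references(self):
        """The two reference frames offered to motion-compensated prediction."""
        return [self.stfm, self.ltfm]


# Toy usage with 2 x 2 frames: a small change, then an abrupt one.
frame_a = [[10, 10], [10, 10]]
frame_b = [[12, 12], [12, 12]]
frame_c = [[200, 200], [200, 200]]
memory = TwoFrameMemory(frame_a)
changed_ab = memory.update(frame_b)
changed_bc = memory.update(frame_c)
```

With such a rule, a sequence that cuts back and forth between two scenes always has a frame of the other scene available in the LTFM, which is one way the second reference can pay off.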
Similar to the STFM/LTFM is the reference picture selection (RPS) mode
included in Annex N of H.263+ (refer to Chapter 3). In this mode, switching
to a different reference picture can be signaled at the picture level. It should be
pointed out, however, that this option was designed for error resilience rather
than for coding efficiency. Its main function is to stop error propagation due
to transmission errors.
Probably the most significant contributions to the field of MR-MCP are
those made by Wiegand and Girod et al. [135–141]. They noted [135, 136]
that long-term statistical dependencies in video sequences are not exploited by
existing video standards. Thus, they proposed to extend motion estimation and
compensation to utilize several past decoded frames. They called this
technique long-term-memory motion-compensated prediction (LTM-MCP). They
demonstrated that the use of this technique can lead to significant
improvements in coding efficiency.
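
A memory holding several past decoded frames could be organized as in the sketch below. It assumes a simple sliding window in which the oldest frame is discarded once the capacity is reached, and it assumes that d_t = 0 addresses the most recent decoded frame. The class name `LongTermMemory` and its methods are hypothetical, and the buffer control shown is only one possibility, not necessarily the one used in [135–141].

```python
from collections import deque


class LongTermMemory:
    """Hypothetical sliding-window memory of the most recent decoded frames."""

    def __init__(self, capacity):
        # A bounded deque drops the item at the far end when it is full.
        self._frames = deque(maxlen=capacity)

    def push(self, decoded):
        """Insert a newly decoded frame as index 0 (the newest frame)."""
        self._frames.appendleft(decoded)

    def __len__(self):
        return len(self._frames)

    def frame(self, dt):
        """Return the reference frame addressed by temporal displacement dt."""
        return self._frames[dt]


# Toy usage: strings stand in for decoded frames.
ltm = LongTermMemory(capacity=3)
for name in ["f0", "f1", "f2", "f3"]:
    ltm.push(name)
```

After the four pushes the memory holds f3, f2, and f1 at d_t = 0, 1, and 2 respectively, and f0 has been discarded. Both encoder and decoder would have to apply the same update so that a given d_t addresses the same frame on each side.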