Page 165 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience

142   Chapter 6. Multiple-Reference Motion Estimation Techniques

            6.2  Multiple-Reference Motion Estimation: A Review


            In multiple-reference motion-compensated prediction (MR-MCP), motion
            estimation and compensation are extended to utilize more than one reference
            frame. The reference frames are assembled in a multiframe memory (or
            buffer) that is maintained simultaneously at the encoder and the decoder. In
            this case, in addition to the spatial displacements (d_x, d_y), a motion vector
            is extended to also include a temporal displacement d_t. This is the index
            into the multiframe memory. The process of MR-MCP is illustrated in Figure 6.1.
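
The extended motion vector can be illustrated with a minimal full-search sketch in Python. It is not the book's algorithm: the function names (`sad`, `mr_full_search`), the SAD matching criterion, the sign convention (the candidate block sits at the current block position plus (d_x, d_y)), and the ordering of the memory (index 0 taken as the most recent decoded frame) are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[List[int]]  # luminance samples, addressed as frame[y][x]


def sad(cur: Frame, ref: Frame, bx: int, by: int,
        dx: int, dy: int, n: int) -> int:
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def mr_full_search(cur: Frame, memory: Sequence[Frame], bx: int, by: int,
                   n: int, search_range: int
                   ) -> Tuple[Optional[Tuple[int, int, int]], Optional[int]]:
    """Exhaustively search every frame in the multiframe memory.

    Returns ((d_x, d_y, d_t), cost), where d_t is the index of the
    selected reference frame in `memory` (assumed convention).
    """
    height, width = len(cur), len(cur[0])
    best_mv: Optional[Tuple[int, int, int]] = None
    best_cost: Optional[int] = None
    for dt, ref in enumerate(memory):            # temporal displacement
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                x, y = bx + dx, by + dy
                # Skip candidates that fall outside the reference frame.
                if x < 0 or y < 0 or x + n > width or y + n > height:
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dx, dy, dt)
    return best_mv, best_cost
```

Compared with single-reference search, the only structural change is the outer loop over `memory`, so the search cost in this sketch grows linearly with the number of reference frames.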
               The main aim of MR-MCP is to improve coding efficiency. Thus, the
            reference generation block in Figure 6.1(a) can utilize any technique that
            provides useful data for motion-compensated prediction. Examples of such
            techniques are reviewed in what follows.
               A number of MR-MCP techniques have been proposed for inclusion within
            MPEG-4. Examples are global motion compensation (GMC) [131, 132],
            dynamic sprites (DS) [132], and short-term frame memory/long-term frame
            memory (STFM/LTFM) prediction [133]. In these techniques, MCP is
            performed using two reference frames. The first reference frame is always the
            past decoded frame, whereas the second reference frame is generated using
            different methods. In GMC, the past decoded frame is warped to provide the
            second reference frame. The technique of DS is a more general case of GMC.
            In DS, past decoded frames are warped and blended into a sprite memory. This
            sprite memory is used to provide the second reference frame. In STFM/LTFM,
            two frame memories are used. The STFM is used to store the past decoded
            frame, whereas the LTFM is used to store an earlier decoded frame. The
            LTFM is updated using a refresh rule based on scene-change detection. Both
            DS and STFM/LTFM can benefit from another MR-MCP technique, which is
            background memory prediction [134].
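
As a rough illustration of the STFM/LTFM bookkeeping, the following Python sketch keeps two frame memories and refreshes the long-term one when a scene change is detected. The text does not specify the refresh rule of [133]; here it is assumed, purely for illustration, that a scene change is declared when the mean absolute difference between the new decoded frame and the STFM exceeds a threshold, and that the frame leaving the STFM is then copied into the LTFM. The class name `TwoFrameMemory`, the helper `mean_abs_diff`, and the `threshold` parameter are hypothetical.

```python
from typing import List, Optional

Frame = List[List[int]]  # luminance samples, addressed as frame[y][x]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute sample difference between two equally sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


class TwoFrameMemory:
    """Short-term and long-term frame memories (illustrative sketch only)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold          # assumed scene-change threshold
        self.stfm: Optional[Frame] = None   # most recent decoded frame
        self.ltfm: Optional[Frame] = None   # an earlier decoded frame

    def update(self, decoded: Frame) -> bool:
        """Store a newly decoded frame; return True if the LTFM was refreshed."""
        refreshed = False
        if (self.stfm is not None
                and mean_abs_diff(decoded, self.stfm) > self.threshold):
            # Assumed refresh rule: keep the last frame before the change.
            self.ltfm = self.stfm
            refreshed = True
        self.stfm = decoded                 # the STFM always tracks the past frame
        return refreshed

    def references(self) -> List[Frame]:
        """Reference frames currently available for prediction."""
        return [f for f in (self.stfm, self.ltfm) if f is not None]
```

Because the encoder and decoder both apply `update` to the same decoded frames, the two memories stay synchronized without extra signaling in this sketch.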
               Similar to the STFM/LTFM is the reference picture selection (RPS) mode
            included in Annex N of H.263+ (refer to Chapter 3). In this mode, switching
            to a different reference picture can be signaled at the picture level. It should be
            pointed out, however, that this option was designed for error resilience rather
            than for coding efficiency. Its main function is to stop error propagation due
            to transmission errors.
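
How reference switching can stop error propagation may be sketched as follows. This is not the Annex N syntax or procedure: the class `ReferenceSelector`, its methods, and the assumption that the encoder learns the number of the first damaged picture through some feedback mechanism are all hypothetical. On an error report, the damaged picture and every later stored picture are treated as unusable (a conservative assumption), and the next picture is predicted from the most recent picture still believed to be intact.

```python
from typing import List, Optional


class ReferenceSelector:
    """Encoder-side choice of reference picture (illustrative sketch only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Picture numbers believed error-free at the decoder, oldest first.
        self.clean: List[int] = []

    def store(self, picture_number: int) -> None:
        """Record a newly coded picture and drop the oldest if the store is full."""
        self.clean.append(picture_number)
        if len(self.clean) > self.capacity:
            self.clean.pop(0)

    def report_error(self, picture_number: int) -> None:
        """Discard the damaged picture and all later stored pictures,
        since they may have been predicted from corrupted data."""
        self.clean = [p for p in self.clean if p < picture_number]

    def select(self) -> Optional[int]:
        """Return the picture number to predict from, or None if no intact
        picture remains (the encoder would then have to intra-code)."""
        return self.clean[-1] if self.clean else None
```

For example, after pictures 0 to 3 are stored and an error is reported for picture 2, `select()` returns 1, so the next picture is predicted from picture 1 and the corrupted pictures no longer influence prediction.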
               Probably the most significant contributions to the field of MR-MCP are
            those made by Wiegand and Girod et al. [135–141]. They noted [135, 136]
            that long-term statistical dependencies in video sequences are not exploited by
            existing video standards. Thus, they proposed to extend motion estimation and
            compensation to utilize several past decoded frames. They called this
            technique long-term-memory motion-compensated prediction (LTM-MCP). They
            demonstrated that the use of this technique can lead to significant
            improvements in coding efficiency.