Page 172 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience
Section 6.3. Long-Term Memory Motion-Compensated Prediction
reference frame. For example, at a frame skip of 4, the prediction gain when
using a multiframe memory of size M = 50 frames is 1.87 dB for AKIYO,
2.17 dB for FOREMAN, and 1.25 dB for TABLE TENNIS, compared to single-
reference prediction (i.e., M = 1). Such prediction gains are mainly due to
the long-term statistical dependencies of video sequences. Examples of such
dependencies are the repetitions of sequence content due to uncovered objects
or objects reappearing in the sequence. An interesting point to note here is
that the prediction gains increase with increased frame skip. For example, for
AKIYO when going from M = 1 to M = 50, the prediction gain is 0.62 dB at
a frame skip of 1 and 1.87 dB at a frame skip of 4. This may be due to the
fact that as the frame skip increases, successive frames get more decorrelated.
This increases the chance that a frame other than the immediately preceding
one will be chosen and, consequently, gives more chance to benefit from long-
term memory prediction. In Ref. 136, the benefits of extending LTM-MCP to
half-pel accuracy are discussed. It is shown that further prediction gains can
be achieved by moving from full- to half-pel accuracy. This “accuracy gain”
is comparable to that in the case of single-reference prediction.
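The gains quoted above are differences in dB between multiframe and single-reference prediction. As a minimal sketch, assuming prediction quality is measured as the PSNR of the motion-compensated prediction for 8-bit video, the gain reduces to a ratio of mean squared errors. The function names and the MSE values below are hypothetical and chosen only for illustration; they are not measurements from the text.

```python
import math


def psnr_db(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    return 10.0 * math.log10(peak * peak / mse)


def prediction_gain_db(mse_single, mse_multi):
    """Gain in dB of multiframe (M > 1) over single-reference (M = 1) prediction.

    Equivalent to 10 * log10(mse_single / mse_multi); the peak value cancels.
    """
    return psnr_db(mse_multi) - psnr_db(mse_single)


# Hypothetical prediction-error MSE values, for illustration only.
gain = prediction_gain_db(mse_single=40.0, mse_multi=26.0)
print(f"prediction gain: {gain:.2f} dB")
```

Because the peak value cancels, a gain of this kind depends only on how much the larger memory reduces the prediction-error energy.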
It should be emphasized that the improved prediction quality of LTM-MCP
is achieved at the expense of:
1. Increased memory requirements at both the encoder and the decoder.
2. Additional bit rate to transmit the extra temporal component, d_t, of each
motion vector.
3. Increased computational complexity at the encoder.
Item 1 is not a major drawback, given the rapid drop in the price of memory
chips; item 2 is investigated further in Section 6.3.3, and a possible
solution for item 3 is proposed in Chapter 8.
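The three costs listed above can be seen in a minimal sketch of a multiframe full-pel full-search block matcher. This is an illustration under stated assumptions, not the encoder used in the text: the function names are hypothetical, SAD is assumed as the matching criterion, dt = 0 is assumed to index the most recent stored frame, candidates are assumed to lie entirely inside the frame, and the search range is left as a required parameter.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def ltm_full_search(cur, memory, bx, by, *, search, n=16):
    """Full-pel full search over every frame in the multiframe memory.

    Returns ((dx, dy, dt), best_sad). `memory` holds M reference frames
    (cost 1: storage), the result carries the extra index dt (cost 2: bits),
    and the work grows linearly with len(memory) (cost 3: complexity).
    Ties keep the first candidate found, i.e. the smallest dt.
    """
    h, w = len(cur), len(cur[0])
    best = None
    for dt, ref in enumerate(memory):  # assumption: dt = 0 is the newest frame
        for dy in range(-search, search + 1):
            if by + dy < 0 or by + dy + n > h:
                continue  # assumption: candidate block must stay inside the frame
            for dx in range(-search, search + 1):
                if bx + dx < 0 or bx + dx + n > w:
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best is None or cost < best[1]:
                    best = ((dx, dy, dt), cost)
    return best


def make_frame(seed, size=8):
    """Synthetic test frame (a simple ramp); purely illustrative data."""
    return [[7 * x + 13 * y + 31 * seed for x in range(size)] for y in range(size)]


# Three stored frames; the current frame reuses content from the oldest one,
# shifted by one pel, mimicking an object that reappears after a few frames.
memory = [make_frame(1), make_frame(2), make_frame(3)]
oldest = memory[2]
cur = [[oldest[max(y - 1, 0)][min(x + 1, 7)] for x in range(8)] for y in range(8)]

mv, cost = ltm_full_search(cur, memory, bx=2, by=2, search=2, n=4)
print(mv, cost)
```

In this toy setup the best match comes from the oldest stored frame rather than the immediately preceding one, which is the situation in which long-term memory helps. Restricting `memory` to its first entry gives the single-reference case.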
6.3.3 Efficiency at Very Low Bit Rates
As already discussed in Section 6.1, LTM-MCP extends the motion vector
of a block by a third component, d_t. This is the temporal displacement or
the index into the multiframe memory. Obviously, the transmission of this
extra component incurs an additional bit rate compared to the single-reference
case. This additional bit rate has to be justified in terms of an improvement
in the rate-distortion (R-D) performance. This subsection investigates the R-D
performance of the LTM-MCP technique. Particular emphasis is given to the
efficiency of this technique at the very low bit rates typical of mobile video
communication. Four H.263-like encoders were implemented:
SR This is a single-reference encoder. It uses full-pel full-search block
matching with macroblocks of 16 × 16 pels, a maximum allowed spatial