Page 53 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience
P. 53
30 Chapter 2. Video Coding: Fundamentals
Input ·
frame Segment f ( , x ) y Forward F ( u ,v ) F ( , u ) v Symbol
into N × N transform Quantizer encoder
blocks
(a) Encoder
· Reconstructed
F ( , u ) v ˆ f (x , ) y frame
Symbol Inverse Combine
decoder transform N × N blocks
(b) Decoder
Figure 2.9: Block diagram of a transform coding system
A unitary space-frequency transform is applied to each block to produce an
16
N × N block of transform (spectral) coeGcients that are then suitably quan-
tized and coded. At the decoder, an inverse transform is applied to reconstruct
the frame. The main goal of the transform is to decorrelate the pels of the
input block. This is achieved by redistributing the energy of the pels and con-
centrating most of it in a small set of transform coeGcients. This is known as
energycompaction. The transform process can also be interpreted as a coor-
dinate rotation of the input or as a decomposition of the input into orthogonal
basis functions weighted by the transform coeGcients [29]. Compression comes
about from two main mechanisms. First, low-energy coeGcients can be dis-
carded with minimum impact on the reconstruction quality. Second, the HVS
has di6ering sensitivity to di6erent frequencies. Thus, the retained coeGcients
can be quantized according to their visual importance.
When choosing a transform, three main properties are desired: good en-
ergy compaction, data-independent basis functions, and fast implementation.
The Karhunen-LoVeve transform (KLT) is the optimal transform in an energy-
compaction sense. Unfortunately, this optimality is due to the fact that the
KLT basis functions are dependent on the covariance matrix of the input
block. Recomputing and transmitting the basis functions for each block is
a nontrivial computational task. These disadvantages severely limit the use
of the KLT in practical coding systems. The performance of many subopti-
mal transforms with data-independent basis functions have been studied [30].
Examples are the discrete Fourier transform (DFT), the discrete cosine trans-
form (DCT), the Walsh-Hadamard transform (WHT), and the Haar transform.
It has been demonstrated that the DCT has the closest energy-compaction
performance to that of the optimum KLT [30]. This has motivated the de-
velopment of a number of fast DCT algorithms, e.g., Ref. 31. Due to these
16 A unitary transform is a reversible linear transform with orthonormal basis functions [29].