Page 126 - Video Coding for Mobile Communications Efficiency, Complexity, and Resilience
Section 4.5. Frequency-Domain Methods
where $F_t$ and $F_{t-\Delta t}$ are the FTs of the current and reference
frames, respectively. In Ref. 88, Haskell noticed this relationship but did
not propose an algorithm to recover the displacement from the phase shift.
If we define $\Delta\phi(w_x, w_y)$ as the phase difference between the FT of
the current frame and that of the reference frame, then
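$$
\begin{aligned}
e^{j\Delta\phi(w_x, w_y)} &= e^{j[\phi_t(w_x, w_y) - \phi_{t-\Delta t}(w_x, w_y)]} \\
&= e^{j\phi_t(w_x, w_y)} \cdot e^{-j\phi_{t-\Delta t}(w_x, w_y)} \\
&= \frac{F_t(w_x, w_y)}{|F_t(w_x, w_y)|} \cdot \frac{F^*_{t-\Delta t}(w_x, w_y)}{|F^*_{t-\Delta t}(w_x, w_y)|},
\end{aligned}
\tag{4.24}
$$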
where $\phi_t$ and $\phi_{t-\Delta t}$ are the phase components of $F_t$ and
$F_{t-\Delta t}$, respectively, and the superscript $*$ indicates the complex
conjugate. If we define $c_{t,t-\Delta t}(x, y)$ as the inverse FT of
$e^{j\Delta\phi(w_x, w_y)}$, then
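$$
\begin{aligned}
c_{t,t-\Delta t}(x, y) &= \mathcal{F}^{-1}\{e^{j\Delta\phi(w_x, w_y)}\} \\
&= \mathcal{F}^{-1}\{e^{j\phi_t(w_x, w_y)} \cdot e^{-j\phi_{t-\Delta t}(w_x, w_y)}\} \\
&= \mathcal{F}^{-1}\{e^{j\phi_t(w_x, w_y)}\} \otimes \mathcal{F}^{-1}\{e^{-j\phi_{t-\Delta t}(w_x, w_y)}\},
\end{aligned}
\tag{4.25}
$$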
where $\otimes$ is the 2-D convolution operation. In other words,
$c_{t,t-\Delta t}(x, y)$ is the cross-correlation of the inverse FTs of the
phase components of $F_t$ and $F_{t-\Delta t}$. For this reason,
$c_{t,t-\Delta t}(x, y)$ is known as the phase correlation function. The
importance of this function becomes apparent if it is rewritten in terms of the
phase difference in Equation (4.23):
$$
\begin{aligned}
c_{t,t-\Delta t}(x, y) &= \mathcal{F}^{-1}\{e^{j\Delta\phi(w_x, w_y)}\} \\
&= \mathcal{F}^{-1}\{e^{j(-w_x d_x - w_y d_y)}\} \\
&= \delta(x - d_x, y - d_y).
\end{aligned}
\tag{4.26}
$$
Thus, the phase correlation surface has a distinctive impulse at $(d_x, d_y)$.
This observation is the basic idea behind the phase correlation motion
estimation method. In this method, Equation (4.24) is used to calculate
$e^{j\Delta\phi(w_x, w_y)}$, the inverse FT is then applied to obtain
$c_{t,t-\Delta t}(x, y)$, and the location of the impulse in this function is
detected to estimate $(d_x, d_y)$.
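The three steps just described can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the book's implementation: the function name `phase_correlation`, the `eps` guard against division by zero, and the wrap-around handling of negative shifts are all assumptions made here, and the whole frame is treated as one area that moves as a single circular translation, so that the current frame equals the reference frame displaced by $(d_x, d_y)$.

```python
import numpy as np


def phase_correlation(current, reference, eps=1e-12):
    """Estimate the integer shift (d_x, d_y) of `current` relative to `reference`.

    Hypothetical helper for illustration; assumes a pure circular translation.
    """
    F_t = np.fft.fft2(current)
    F_ref = np.fft.fft2(reference)

    # Equation (4.24): keep only the phase difference by normalising the
    # cross-power spectrum F_t * conj(F_ref) to unit magnitude.
    cross = F_t * np.conj(F_ref)
    cross /= np.maximum(np.abs(cross), eps)  # eps (assumed) avoids 0/0

    # Inverse DFT of e^{j * delta_phi} gives the phase correlation surface.
    surface = np.real(np.fft.ifft2(cross))

    # Detect the dominant peak; array rows index y and columns index x.
    peak_y, peak_x = np.unravel_index(np.argmax(surface), surface.shape)

    # The DFT is periodic, so peaks past the midpoint represent negative shifts.
    height, width = surface.shape
    d_y = peak_y - height if peak_y > height // 2 else peak_y
    d_x = peak_x - width if peak_x > width // 2 else peak_x
    return (int(d_x), int(d_y)), surface


# Usage: shift a random 32x32 frame by d_y = 3 rows and d_x = -5 columns.
rng = np.random.default_rng(0)
reference_frame = rng.random((32, 32))
current_frame = np.roll(reference_frame, shift=(3, -5), axis=(0, 1))
displacement, _ = phase_correlation(current_frame, reference_frame)
print(displacement)
```

For this synthetic circular shift the surface is a single unit impulse, so the peak search is trivial. On real frames the surface contains several competing peaks, and a simple `argmax` as used here only picks the strongest one.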
In practice, the impulse in the phase correlation function degenerates into
one or more peaks. This is due to several factors, such as the use of the
discrete Fourier transform (DFT) instead of the FT for digital images, the
presence of more than one moving object within the considered area A, and the
presence of