Page 422 - Matrix Analysis & Applied Linear Algebra
P. 422
418 Chapter 5 Norms, Inner Products, and Orthogonality
Example 5.12.3
Filtering Noisy Data. The SVD can be a useful tool in applications involving
the need to sort through noisy data and lift out relevant information. Suppose
that A m×n is a matrix containing data that are contaminated with a certain
level of noise—e.g., the entries A might be digital samples of a noisy video or
audio signal such as that in Example 5.8.3 (p. 359). The SVD resolves the data
in A into r mutually orthogonal components by writing
r r
0
D r×r T T
A = U V = σ i u i v = σ i Z i , (5.12.11)
i
0 0
i=1 i=1
where Z i = u i v T and σ 1 ≥ σ 2 ≥· · · ≥ σ r > 0. The matrices {Z 1 , Z 2 ,..., Z r }
i
constitute an orthonormal set because
T 0if i = j,
Z i Z j = trace Z Z j =
i
1if i = j.
In other words, the SVD (5.12.11) can be regarded as a Fourier expansion as
described on p. 299 and, consequently, σ i = Z i A can be interpreted as the
proportion of A lying in the “direction” of Z i . In many applications the noise
contamination in A is random (or nondirectional) in the sense that the noise
is distributed more or less uniformly across the Z i ’s. That is, there is about as
much noise in the “direction” of one Z i as there is in the “direction” of any
other. Consequently, we expect each term σ i Z i to contain approximately the
same level of noise. This means that if SNR(σ i Z i ) denotes the signal-to-noise
ratio in σ i Z i , then
SNR(σ 1 Z 1 ) ≥ SNR(σ 2 Z 2 ) ≥ ··· ≥ SNR(σ r Z r ),
more or less. If some of the singular values, say, σ k+1 ,...,σ r , are small relative to
(total noise)/r, then the terms σ k+1 Z k+1 ,...,σ r Z r have small signal-to-noise
ratios. Therefore, if we delete these terms from (5.12.11), then we lose a small part
of the total signal, but we remove a disproportionately large component of the
k
total noise in A. This explains why a truncated SVD A k = σ i Z i can, in
i=1
many instances, filter out some of the noise without losing significant information
about the signal in A. Determining the best value of k often requires empirical
techniques that vary from application to application, but looking for obvious
gaps between large and small singular values is usually a good place to start.
The next example presents an interesting application of this idea to building an
Internet search engine.