44 The Optimization Verification test
Suppose you are building a speech recognition system. Your system works by inputting an
audio clip A, and computing some $\text{Score}_A(S)$ for each possible output sentence S. For
example, you might try to estimate $\text{Score}_A(S) = P(S \mid A)$, the probability that the correct
output transcription is the sentence S, given that the input audio was A.
Given a way to compute $\text{Score}_A(S)$, you still have to find the English sentence S that
maximizes it:

$$\text{Output} = \arg\max_S \text{Score}_A(S)$$
How do you compute the “arg max” above? If the English language has 50,000 words, then
there are $(50{,}000)^N$ possible sentences of length N, far too many to exhaustively enumerate.
So, you need to apply an approximate search algorithm, to try to find the value of S that
optimizes (maximizes) $\text{Score}_A(S)$. One example search algorithm is “beam search,” which
keeps only K top candidates during the search process. (For the purposes of this chapter, you
don’t need to understand the details of beam search.) Algorithms like this are not guaranteed
to find the value of S that maximizes $\text{Score}_A(S)$.
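
To make the lack of a guarantee concrete, here is a minimal sketch of a beam search next to an exhaustive search on a toy problem. Everything in it is an assumption made for illustration: the three-word vocabulary, the hand-picked numbers in `step_score`, and the function names are hypothetical and do not come from any real speech system. The score of a sentence is simply the sum of its per-word step scores.

```python
from itertools import product

VOCAB = ["a", "b", "c"]  # toy vocabulary (a real system might have ~50,000 words)

def step_score(prefix, word):
    """Hypothetical score for appending `word` to `prefix` (hand-picked toy numbers)."""
    if not prefix:
        return {"a": -1.0, "b": -2.0, "c": -3.0}[word]
    if prefix[-1] == "b" and word == "c":
        return -0.5  # "b" followed by "c" is a very good continuation
    return {"a": -3.0, "b": -3.5, "c": -4.0}[word]

def score(sentence):
    """Total score of a sentence (a tuple of words): sum of its step scores."""
    return sum(step_score(sentence[:i], w) for i, w in enumerate(sentence))

def exhaustive_search(n):
    """Exact arg max over all len(VOCAB)**n sentences; only feasible for toy sizes."""
    return max(product(VOCAB, repeat=n), key=score)

def beam_search(n, k):
    """Approximate search: extend each kept prefix by one word, keep the top k."""
    beams = [()]
    for _ in range(n):
        candidates = [b + (w,) for b in beams for w in VOCAB]
        candidates.sort(key=score, reverse=True)
        beams = candidates[:k]
    return beams[0]

print("search space size:", len(VOCAB) ** 2)
print("exhaustive:", exhaustive_search(2), score(exhaustive_search(2)))
print("beam, K=1: ", beam_search(2, k=1), score(beam_search(2, k=1)))
print("beam, K=2: ", beam_search(2, k=2), score(beam_search(2, k=2)))
```

With K=1 the search commits to the best-looking first word and never sees the higher-scoring sentence that starts with the second-best word; with K=2 it happens to recover the true maximum in this toy. In a realistic setting the search space grows as $(50{,}000)^N$, so the exhaustive check used here for comparison is not available.
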
Suppose that an audio clip A records someone saying “I love machine learning.” But instead
of outputting the correct transcription, your system outputs the incorrect “I love robots.”
There are now two possibilities for what went wrong:
1. Search algorithm problem. The approximate search algorithm (beam search) failed
to find the value of S that maximizes $\text{Score}_A(S)$.
2. Objective (scoring function) problem. Our estimates for $\text{Score}_A(S) = P(S \mid A)$ were
inaccurate. In particular, our choice of $\text{Score}_A(S)$ failed to recognize that “I love machine
learning” is the correct transcription.
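
The two possibilities can be illustrated with a toy sketch in which both produce the same wrong output. The candidate list, the score tables, and the `limited_search` stand-in for an approximate search are all hypothetical, chosen only to mirror the example above; real scores would come from a learned model, and a real search would not enumerate candidates this way.

```python
CORRECT = "I love machine learning"
CANDIDATES = [CORRECT, "I love robots", "I loathe machines"]

# Hypothetical Score_A(S) tables for the same audio clip A.
good_scores = {CORRECT: 0.70, "I love robots": 0.20, "I loathe machines": 0.10}
bad_scores = {CORRECT: 0.15, "I love robots": 0.75, "I loathe machines": 0.10}

def full_search(scores):
    """Exact arg max: examines every candidate (feasible only in a toy)."""
    return max(CANDIDATES, key=scores.get)

def limited_search(scores, examined):
    """Stand-in for an approximate search that only examines some candidates."""
    return max(examined, key=scores.get)

# Possibility 1: the scores favor the correct sentence, but the search never examined it.
output_1 = limited_search(good_scores, examined=["I love robots", "I loathe machines"])

# Possibility 2: the search finds the true maximum, but the scores favor the wrong sentence.
output_2 = full_search(bad_scores)

print(output_1)
print(output_2)
```

Both runs output “I love robots,” so the wrong transcription by itself does not reveal which component failed.
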
Depending on which of these was the cause of the failure, you should prioritize your efforts
very differently. If #1 was the problem, you should work on improving the search algorithm.
If #2 was the problem, you should work on the learning algorithm that estimates $\text{Score}_A(S)$.
Facing this situation, some researchers will randomly decide to work on the search
algorithm; others will randomly work on a better way to learn values for $\text{Score}_A(S)$. But
unless you know which of these is the underlying cause of the error, your efforts could be
wasted. How can you decide more systematically what to work on?
Page 85 Machine Learning Yearning-Draft Andrew Ng