
6 Your dev and test sets should come from the same distribution

             You have your cat app image data segmented into four regions, based on your largest
             markets: (i) US, (ii) China, (iii) India, and (iv) Other. To come up with a dev set and a test
             set, say we put US and India in the dev set, and China and Other in the test set. In other
             words, we can randomly assign two of these segments to the dev set, and the other two to
             the test set, right?


             Once you define the dev and test sets, your team will be focused on improving dev set
             performance. Thus, the dev set should reflect the task you want to improve on the most: to
             do well on all four geographies, and not only two.
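
One way to get a dev set that reflects all four geographies is to pool the data from every region, shuffle it, and only then cut it into dev and test sets. The sketch below is a minimal illustration of that idea, not a prescribed procedure: the function names, the 50/50 split, the fixed seed, and the placeholder image ids are all assumptions made for the example.

```python
import random
from collections import Counter

REGIONS = ["US", "China", "India", "Other"]  # the four segments named in the text


def split_dev_test(examples, dev_fraction=0.5, seed=0):
    """Shuffle the pooled examples, then cut them into dev and test sets.

    `examples` is a list of (image_id, region) pairs. Because all examples are
    shuffled together before the cut, both sets are drawn from the same
    mixture of regions.
    """
    pooled = list(examples)
    random.Random(seed).shuffle(pooled)
    cut = int(len(pooled) * dev_fraction)
    return pooled[:cut], pooled[cut:]


def region_shares(examples):
    """Return the fraction of examples that belong to each region."""
    counts = Counter(region for _, region in examples)
    return {region: counts[region] / len(examples) for region in REGIONS}


# Hypothetical data: 1,000 placeholder image ids per region.
examples = [(f"{region}-{i}", region) for region in REGIONS for i in range(1000)]
dev_set, test_set = split_dev_test(examples)
print(region_shares(dev_set))   # each region's share of the dev set
print(region_shares(test_set))  # each region's share of the test set
```

With this kind of split, every region appears in both sets in roughly the same proportion, whereas assigning whole regions to one set or the other would leave each set with only two of the four geographies.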

             There is a second problem with having different dev and test set distributions: There is a
             chance that your team will build something that works well on the dev set, only to find that it
             does poorly on the test set. I’ve seen this result in much frustration and wasted effort. Avoid
             letting this happen to you.

             As an example, suppose your team develops a system that works well on the dev set but not
             the test set. If your dev and test sets had come from the same distribution, then you would
             have a very clear diagnosis of what went wrong: You have overfit the dev set. The obvious
             cure is to get more dev set data.
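
The comparison behind this diagnosis can be sketched in a few lines: measure the error rate on each set and look at the gap. This is only an illustration. The helper names, the toy labels, and the `tolerance` threshold are assumptions for the example, and a large gap points to dev set overfitting only when both sets come from the same distribution.

```python
def error_rate(predictions, labels):
    """Fraction of examples where the prediction differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


def dev_test_gap(dev_error, test_error, tolerance=0.02):
    """Return the test-minus-dev error gap and whether it exceeds `tolerance`.

    `tolerance` is an illustrative threshold, not a value from the text.
    """
    gap = test_error - dev_error
    return gap, gap > tolerance


# Hypothetical predictions and labels (1 = cat, 0 = not cat).
dev_error = error_rate([1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
                       [1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
test_error = error_rate([1, 1, 0, 1, 0, 0, 1, 0, 1, 1],
                        [1, 0, 1, 1, 0, 1, 1, 0, 0, 0])
gap, suspicious = dev_test_gap(dev_error, test_error)
print(f"dev error {dev_error:.2f}, test error {test_error:.2f}, flagged: {suspicious}")
```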

             But if the dev and test sets come from different distributions, then your options are less
             clear. Several things could have gone wrong:

             1. You had overfit to the dev set.


             2. The test set is harder than the dev set. So your algorithm might be doing as well as could
                 be expected, and no further significant improvement is possible.







             Page 17                            Machine Learning Yearning-Draft                       Andrew Ng