To describe the second effect in different terms, we can turn to the fictional character Sherlock Holmes, who says that your brain is like an attic; it only has a finite amount of space. He says that “for every addition of knowledge, you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones.”12
Fortunately, if you have the computational capacity needed to build a big enough neural network—i.e., a big enough attic—then this is not a serious concern. You have enough capacity to learn from both internet images and mobile app images, without the two types of data competing for capacity. Your algorithm’s “brain” is big enough that you don’t have to worry about running out of attic space.
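
For concreteness, here is a minimal sketch of how such a combined training set might be assembled. The directory names and split sizes are hypothetical; the key point is that dev and test sets are drawn only from the mobile app distribution you actually care about, while the training set pools both sources.

```python
import random
from pathlib import Path

def load_image_paths(directory):
    """Collect .jpg paths under `directory` (hypothetical data layout)."""
    return sorted(str(p) for p in Path(directory).rglob("*.jpg"))

random.seed(0)

# Hypothetical directories for the two data sources.
mobile = load_image_paths("data/mobile_app")  # the distribution you care about
internet = load_image_paths("data/internet")  # plentiful, but a different distribution
random.shuffle(mobile)

# Dev and test sets reflect only the distribution you want to do well on.
dev_set = mobile[:2500]
test_set = mobile[2500:5000]

# With a big enough network (a big enough "attic"), the training set can
# include both sources without the two competing for capacity.
train_set = mobile[5000:] + internet
random.shuffle(train_set)
```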

But if you do not have a big enough neural network (or another highly flexible learning algorithm), then you should pay more attention to your training data matching your dev/test set distribution.

If you think you have data that has no benefit, you should just leave out that data for computational reasons. For example, suppose your dev/test sets contain mainly casual pictures of people, places, landmarks, and animals. Suppose you also have a large collection of scanned historical documents:

[Image: examples of scanned historical documents]

These documents don’t contain anything resembling a cat. They also look completely unlike your dev/test distribution. There is no point including this data as negative examples, because the benefit from the first effect above is negligible—there is almost nothing your neural network can learn from this data that it can apply to your dev/test set distribution. Including them would waste computational resources and the representational capacity of the neural network.
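
As a concrete illustration, the following sketch drops such irrelevant data before training. It assumes each example carries a hypothetical "source" tag; the label "scanned_documents" stands in for any source that looks nothing like your dev/test distribution.

```python
# Sources judged to resemble the dev/test distribution (hypothetical labels).
RELEVANT_SOURCES = {"mobile_app", "internet"}

def filter_training_data(examples):
    """Keep only examples from sources that resemble the dev/test distribution."""
    return [ex for ex in examples if ex["source"] in RELEVANT_SOURCES]

examples = [
    {"path": "img_001.jpg", "source": "mobile_app"},
    {"path": "img_002.jpg", "source": "internet"},
    {"path": "doc_001.png", "source": "scanned_documents"},  # excluded below
]

train_set = filter_training_data(examples)
assert all(ex["source"] != "scanned_documents" for ex in train_set)
```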



12 A Study in Scarlet by Arthur Conan Doyle

