
39 Weighting data



Suppose you have 200,000 images from the internet and 5,000 images from your mobile app users. There is a 40:1 ratio between the sizes of these datasets. In theory, so long as you build a huge neural network and train it long enough on all 205,000 images, there is no harm in trying to make the algorithm do well on both internet images and mobile images.

But in practice, having 40x as many internet images as mobile app images might mean you need to spend 40x (or more) as much computation to model both, compared to training on only the 5,000 mobile images.
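The ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes, as a simplification not stated in the text, that the cost of one pass over the data grows linearly with the number of examples. The variable names are illustrative.

```python
# Dataset sizes taken from the text.
n_internet = 200_000
n_mobile = 5_000

# Ratio between the two datasets (the "40:1" figure).
size_ratio = n_internet / n_mobile

# Assumption: one training pass costs time proportional to the number of
# examples. Under that assumption, a pass over all 205,000 images costs this
# many times as much as a pass over the 5,000 mobile images alone.
combined_vs_mobile = (n_internet + n_mobile) / n_mobile

print(f"internet : mobile = {size_ratio:.0f} : 1")
print(f"cost of one pass, all data vs. mobile only = {combined_vs_mobile:.0f}x")
```

The "or more" in the text is consistent with this: a model large enough to fit both distributions may also cost more per example, which this linear estimate ignores.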


If you don’t have huge computational resources, you could give the internet images a much lower weight as a compromise.

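For example, suppose your optimization objective is squared error. (This is not a good choice for a classification task, but it will simplify our explanation.) Thus, our learning algorithm tries to optimize:

$$\min_\theta \sum_{(x,y) \in \text{MobileImg}} \left(h_\theta(x) - y\right)^2 + \sum_{(x,y) \in \text{InternetImg}} \left(h_\theta(x) - y\right)^2$$

Here $h_\theta(x)$ is the model's prediction for input $x$, and $y$ is the corresponding label.
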
The first sum above is over the 5,000 mobile images, and the second sum is over the 200,000 internet images. You can instead optimize with an additional parameter $\beta$:

$$\min_\theta \sum_{(x,y) \in \text{MobileImg}} \left(h_\theta(x) - y\right)^2 + \beta \sum_{(x,y) \in \text{InternetImg}} \left(h_\theta(x) - y\right)^2$$

If you set $\beta = 1/40$, the algorithm would give equal weight to the 5,000 mobile images and the 200,000 internet images. You can also set the parameter $\beta$ to other values, perhaps by tuning to the dev set.
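The weighted objective described above can be sketched in a few lines of plain Python. The weight on the internet-image term is called `beta` here. The function names, the fixed toy model `predict`, and the tiny datasets are all hypothetical and exist only to make the arithmetic visible.

```python
from fractions import Fraction


def squared_error(predict, examples):
    """Sum of (prediction - label)^2 over a list of (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in examples)


def weighted_objective(predict, mobile, internet, beta):
    """Mobile-image error plus beta times internet-image error."""
    return squared_error(predict, mobile) + beta * squared_error(predict, internet)


def predict(x):
    # Hypothetical fixed model, used only to evaluate the objective.
    return 2.0 * x


# Toy stand-ins for the two datasets (not real image data).
mobile = [(1.0, 2.5), (2.0, 3.0)]
internet = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

unweighted = weighted_objective(predict, mobile, internet, beta=1.0)
downweighted = weighted_objective(predict, mobile, internet, beta=0.5)

# With the dataset sizes from the text, beta = 1/40 makes the total weight
# of the internet images equal to that of the mobile images.
n_mobile, n_internet = 5_000, 200_000
beta_equal = Fraction(1, 40)
mobile_mass = n_mobile * 1
internet_mass = n_internet * beta_equal

print(unweighted, downweighted)
print(mobile_mass, internet_mass)
```

Setting `beta=1.0` recovers the original unweighted objective, while smaller values shrink the influence of every internet example by the same factor.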

By weighting the additional internet images less, you don’t have to build as massive a neural network to make sure the algorithm does well on both types of images. This type of re-weighting is needed only when you suspect the additional data (internet images) has a very different distribution from the dev/test set, or if the additional data is much larger than the data that came from the same distribution as the dev/test set (mobile images).
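Tuning the weight on the dev set, as suggested earlier, can be illustrated with a toy problem. This is a minimal sketch under strong assumptions that are not from the text: the model is a one-parameter line h(x) = theta * x, for which the weighted squared error has a closed-form minimizer, and the data points and candidate weights are invented. A real image model would be retrained for each candidate value instead.

```python
def fit_slope(mobile, internet, beta):
    """Closed-form minimizer of the weighted squared error for h(x) = theta * x."""
    num = sum(x * y for x, y in mobile) + beta * sum(x * y for x, y in internet)
    den = sum(x * x for x, _ in mobile) + beta * sum(x * x for x, _ in internet)
    return num / den


def dev_error(theta, dev):
    """Mean squared error of h(x) = theta * x on the dev set."""
    return sum((theta * x - y) ** 2 for x, y in dev) / len(dev)


def tune_beta(mobile, internet, dev, candidates):
    """Return the candidate weight whose fitted model has the lowest dev error."""
    return min(
        candidates,
        key=lambda beta: dev_error(fit_slope(mobile, internet, beta), dev),
    )


# Invented toy data: a small, noisy "mobile" training set, a larger
# "internet" set from a different distribution, and a dev set drawn from
# the mobile distribution (true slope 2).
mobile = [(1.0, 1.0), (2.0, 5.0)]
internet = [(1.0, 1.5), (2.0, 3.0), (3.0, 4.5), (4.0, 6.0)]
dev = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

candidates = [0.0, 0.05, 0.1, 0.25, 1.0]
best_beta = tune_beta(mobile, internet, dev, candidates)
print("best beta on dev set:", best_beta)
```

On this toy data an intermediate weight does better on the dev set than either ignoring the extra data (`beta = 0`) or weighting it fully (`beta = 1`), which is the kind of trade-off that dev-set tuning is meant to find.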




             Page 76                            Machine Learning Yearning-Draft                       Andrew Ng