
28 Diagnosing bias and variance: Learning curves




We’ve seen some ways to estimate how much error can be attributed to avoidable bias vs. variance. We did so by estimating the optimal error rate and computing the algorithm’s training set and dev set errors. Let’s discuss a technique that is even more informative: plotting a learning curve.

A learning curve plots your dev set error against the number of training examples. To plot it, you would run your algorithm using different training set sizes. For example, if you have 1,000 examples, you might train separate copies of the algorithm on 100, 200, 300, …, 1,000 examples. Then you could plot how dev set error varies with the training set size. Here is an
example:

[Figure: a learning curve plotting dev set error against training set size]
             As the training set size increases, the dev set error should decrease.
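The procedure above can be sketched in code. This is a minimal sketch with made-up pieces: synthetic two-class data and a simple nearest-centroid classifier stand in for your real dataset and learning algorithm, and the helper names (`dev_error`, `learning_curve`) are illustrative, not from the text.

```python
# Sketch of a learning curve: train separate copies of the algorithm on
# growing subsets of the training data, measuring dev set error each time.
# Synthetic data and a nearest-centroid classifier are stand-ins here.
import numpy as np

def dev_error(train_X, train_y, dev_X, dev_y):
    """Train a nearest-centroid classifier; return its dev set error rate."""
    centroids = {c: train_X[train_y == c].mean(axis=0)
                 for c in np.unique(train_y)}
    classes = np.array(sorted(centroids))
    # Distance from every dev example to each class centroid.
    dists = np.stack([np.linalg.norm(dev_X - centroids[c], axis=1)
                      for c in classes])
    preds = classes[np.argmin(dists, axis=0)]
    return float(np.mean(preds != dev_y))

def learning_curve(train_X, train_y, dev_X, dev_y, sizes):
    """Train a separate copy on the first m training examples for each m."""
    return [(m, dev_error(train_X[:m], train_y[:m], dev_X, dev_y))
            for m in sizes]

# Synthetic two-class problem: 1,000 examples, split 800 train / 200 dev.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 1.0, (500, 2)),
                    rng.normal(1.5, 1.0, (500, 2))])
y = np.array([0] * 500 + [1] * 500)
idx = rng.permutation(1000)
X, y = X[idx], y[idx]
curve = learning_curve(X[:800], y[:800], X[800:], y[800:],
                       range(100, 801, 100))
```

You would then plot the (training set size, dev set error) pairs in `curve`, e.g. with matplotlib, to visualize how dev set error falls as the training set grows.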


             We will often have some “desired error rate” that we hope our learning algorithm will
             eventually achieve. For example:

             • If we hope for human-level performance, then the human error rate could be the “desired
               error rate.”

• If our learning algorithm serves some product (such as delivering cat pictures), we might have an intuition about what level of performance is needed to give users a great experience.
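Once you have both a learning curve and a desired error rate, seeing how far you are from the target is a simple per-point comparison. This is a toy sketch with hypothetical numbers; `gaps_to_desired` is an illustrative helper, not anything defined in the text.

```python
# Toy sketch (hypothetical numbers): how far each point on a learning curve
# sits above a desired error rate, e.g. an estimated human-level error of 5%.
def gaps_to_desired(curve, desired_error):
    """Return (training size, dev error minus desired error) per point."""
    return [(m, round(err - desired_error, 4)) for m, err in curve]

# Hypothetical learning curve: (training set size, dev set error) pairs.
curve = [(100, 0.18), (500, 0.12), (1000, 0.10)]
gaps = gaps_to_desired(curve, desired_error=0.05)
```

When plotting, the desired error rate is usually drawn as a horizontal reference line so you can see whether the curve is trending toward it.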








             Page 56                            Machine Learning Yearning-Draft                       Andrew Ng