
41 Identifying Bias, Variance, and Data Mismatch Errors


             Suppose humans achieve almost perfect performance (≈0% error) on the cat detection task,
             and thus the optimal error rate is about 0%. Suppose you have:

             • 1% error on the training set.


             • 5% error on the training dev set.

             • 5% error on the dev set.

             What does this tell you? Here, you know that you have high variance. The variance reduction
             techniques described earlier should allow you to make progress.


             Now, suppose your algorithm achieves:

             • 10% error on the training set.

             • 11% error on the training dev set.

             • 12% error on the dev set.


             This tells you that you have high avoidable bias; i.e., the algorithm is doing poorly on the
             training set. Bias reduction techniques should help.

             In the two examples above, the algorithm suffered only from high avoidable bias or high
             variance. It is possible for an algorithm to suffer from any subset of high avoidable bias, high
             variance, and data mismatch. For example:


             • 10% error on the training set.

             • 11% error on the training dev set.

             • 20% error on the dev set.


             This algorithm suffers from high avoidable bias and from data mismatch. It does not,
             however, suffer from high variance, since the training error and training dev error (both
             measured on the training set distribution) are close.
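
             To make the arithmetic behind these diagnoses concrete, here is a minimal Python sketch.
             The function name, variable names, and the 2% threshold for calling a gap "high" are
             illustrative choices, not from the text; the error rates are the three scenarios above.

def diagnose(optimal_err, train_err, train_dev_err, dev_err, threshold=0.02):
    """Split the total error into avoidable bias, variance, and data mismatch gaps."""
    avoidable_bias = train_err - optimal_err    # training error vs. optimal error rate
    variance = train_dev_err - train_err        # training dev error vs. training error
    data_mismatch = dev_err - train_dev_err     # dev error vs. training dev error

    problems = []
    if avoidable_bias > threshold:
        problems.append("high avoidable bias")
    if variance > threshold:
        problems.append("high variance")
    if data_mismatch > threshold:
        problems.append("data mismatch")
    return avoidable_bias, variance, data_mismatch, problems


# The three scenarios from this chapter; the optimal error rate is about 0%.
scenarios = [
    ("first example", 0.00, 0.01, 0.05, 0.05),
    ("second example", 0.00, 0.10, 0.11, 0.12),
    ("third example", 0.00, 0.10, 0.11, 0.20),
]

for name, optimal, train, train_dev, dev in scenarios:
    bias, var, mismatch, problems = diagnose(optimal, train, train_dev, dev)
    print(f"{name}: avoidable bias={bias:.0%}, variance={var:.0%}, "
          f"data mismatch={mismatch:.0%} -> {', '.join(problems)}")

             For the third example this prints an avoidable bias of 10%, a variance of 1%, and a data
             mismatch of 9%, matching the diagnosis above.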

             It might be easier to understand how the different types of errors relate to each other by
             drawing them as entries in a table:

             [Table: how the training, training dev, and dev/test errors relate across the training and
             dev/test set distributions.]