
14 Error analysis: Look at dev set examples to evaluate ideas

             When you play with your cat app, you notice several examples where it mistakes dogs for
             cats. Some dogs do look like cats!


A team member proposes incorporating third-party software that will make the system do better on dog images. These changes will take a month, and the team member is enthusiastic. Should you ask them to go ahead?

Before investing a month on this task, I recommend that you first estimate how much it will actually improve the system's accuracy. Then you can more rationally decide if this is worth the month of development time, or if you're better off using that time on other tasks.

             In detail, here’s what you can do:

1. Gather a sample of 100 dev set examples that your system misclassified, i.e., examples that your system made an error on.


2. Look at these examples manually, and count what fraction of them are dog images (a short code sketch of this process follows the list).
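
For concreteness, here is a minimal Python sketch of these two steps. The dev-set record format, the predict function, and the is_dog field are hypothetical stand-ins for whatever your system uses; the manual labeling in step 2 is what you would actually do by hand.

```python
import random

def fraction_of_dog_errors(dev_examples, predict, sample_size=100, seed=0):
    """Estimate what fraction of the system's errors are dog images.

    dev_examples: list of dicts like {"image": ..., "label": ..., "is_dog": bool}
                  (the "is_dog" flag is what you fill in during manual review).
    predict:      the system's classifier, mapping an image to a label.
    """
    # Step 1: gather dev set examples the system misclassified.
    misclassified = [ex for ex in dev_examples
                     if predict(ex["image"]) != ex["label"]]
    random.seed(seed)
    sample = random.sample(misclassified, min(sample_size, len(misclassified)))
    if not sample:
        return 0.0

    # Step 2: count what fraction of the sampled errors are dog images.
    dog_errors = sum(1 for ex in sample if ex.get("is_dog"))
    return dog_errors / len(sample)
```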

The process of looking at misclassified examples is called error analysis. In this example, if you find that only 5% of the misclassified images are dogs, then no matter how much you improve your algorithm's performance on dog images, you won't get rid of more than 5% of your errors. In other words, 5% is a "ceiling" (meaning maximum possible amount) for how much the proposed project could help. Thus, if your overall system is currently 90% accurate (10% error), this improvement is likely to result in at best 90.5% accuracy (or 9.5% error, which is 5% less error than the original 10% error).
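
The ceiling arithmetic above is easy to check directly; the numbers below are the ones from this example:

```python
error_rate = 0.10     # current error: 10% (i.e., 90% accuracy)
dog_fraction = 0.05   # 5% of the misclassified examples are dogs

# Best case: the project eliminates every dog error.
best_error = error_rate * (1 - dog_fraction)  # 0.095 -> 9.5% error
best_accuracy = 1 - best_error                # 0.905 -> 90.5% accuracy
print(f"error ceiling: {best_error:.3f}, accuracy: {best_accuracy:.3f}")
```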




