
33 Why we compare to human-level performance



             Many machine learning systems aim to automate things that humans do well. Examples
             include image recognition, speech recognition, and email spam classification. Learning
             algorithms have also improved so much that we are now surpassing human-level
             performance on more and more of these tasks.


             Further, there are several reasons building an ML system is easier if you are trying to do a
             task that people can do well:

             1. Ease of obtaining data from human labelers. For example, since people recognize
             cat images well, it is straightforward for people to provide high-accuracy labels for your
             learning algorithm.


             2. Error analysis can draw on human intuition. Suppose a speech recognition
             algorithm is doing worse than human-level recognition. Say it incorrectly transcribes an
             audio clip as “This recipe calls for a pear of apples,” mistaking “pair” for “pear.” You can
             draw on human intuition and try to understand what information a person uses to get the
             correct transcription, and use this knowledge to modify the learning algorithm.


             3. Use human-level performance to estimate the optimal error rate and also set
             a “desired error rate.” Suppose your algorithm achieves 10% error on a task, but a person
             achieves 2% error. Then we know that the optimal error rate is 2% or lower and the
             avoidable bias is at least 8%. Thus, you should try bias-reducing techniques.
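
The arithmetic in item 3 can be sketched in a few lines of Python. This is a minimal illustration, assuming error rates are given as fractions between 0 and 1. The function name `avoidable_bias_lower_bound` is hypothetical and does not come from any library. It treats human-level error as an upper bound on the optimal error rate, so the result is only a lower bound on avoidable bias.

```python
def avoidable_bias_lower_bound(algorithm_error: float, human_error: float) -> float:
    """Return a lower bound on avoidable bias.

    Human-level error is an upper bound on the optimal error rate,
    so (algorithm error - human error) is a lower bound on avoidable bias.
    """
    for rate in (algorithm_error, human_error):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("error rates must be fractions in [0, 1]")
    # If the algorithm already beats humans, this estimate gives no
    # evidence of avoidable bias, so clamp the bound at zero.
    return max(0.0, algorithm_error - human_error)


# The example from the text: 10% algorithm error, 2% human error.
bound = avoidable_bias_lower_bound(algorithm_error=0.10, human_error=0.02)
print(f"Avoidable bias is at least {bound:.0%}")
```

When the bound is large relative to the algorithm's total error, as it is here, bias-reducing techniques are the natural next thing to try.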

             Even though item #3 might not sound important, I find that having a reasonable and
             achievable target error rate helps accelerate a team’s progress. Knowing your algorithm has
             high avoidable bias is incredibly valuable and opens up a menu of options to try.

             There are some tasks that even humans aren’t good at. Examples include picking a book to
             recommend to you, picking an ad to show a user on a website, and predicting the stock
             market. Computers already surpass the performance of most people on these tasks. With
             these applications, we run into the following problems:


             •   It is harder to obtain labels. For example, it’s hard for human labelers to annotate a
                 database of users with the “optimal” book recommendation. If you operate a website or
                 app that sells books, you can obtain data by showing books to users and seeing what they
                 buy. If you do not operate such a site, you need to find more creative ways to get data.




             Page 66                            Machine Learning Yearning-Draft                       Andrew Ng