
13 Build your first system quickly, then iterate




             You want to build a new email anti-spam system. Your team has several ideas:

             •   Collect a huge training set of spam email. For example, set up a “honeypot”: deliberately
                 send fake email addresses to known spammers, so that you can automatically harvest the
                 spam messages they send to those addresses.


             •   Develop features for understanding the text content of the email.

             •   Develop features for understanding the email envelope/header, to show which
                 internet servers the message went through.

             •   and more.
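
As a rough illustration of the envelope/header idea, here is a minimal sketch that reads the `Received` headers of a raw message with Python's standard `email` module. The function names `received_hops` and `header_features`, the two example features, and the sample message are assumptions made for illustration; real `Received` headers vary widely and would need more careful parsing.

```python
from email import message_from_string


def received_hops(raw_message):
    """Return the receiving server names from the Received headers, in travel order."""
    msg = message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        # A typical value looks like "from a.example by b.example with SMTP; <date>".
        words = value.split()
        if "by" in words and words.index("by") + 1 < len(words):
            hops.append(words[words.index("by") + 1].strip(";"))
    # Each server prepends its header, so reverse to get oldest hop first.
    return hops[::-1]


def header_features(raw_message):
    """Hypothetical features: how many hops were recorded and which server came first."""
    hops = received_hops(raw_message)
    return {"num_hops": len(hops), "first_server": hops[0] if hops else None}


# Made-up sample message, for illustration only.
sample = (
    "Received: from relay.example.net by mx.example.org with SMTP; Mon, 1 Jan 2024 00:00:02 +0000\n"
    "Received: from sender.example.com by relay.example.net with SMTP; Mon, 1 Jan 2024 00:00:01 +0000\n"
    "From: a@example.com\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
features = header_features(sample)
```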


             Even though I have worked extensively on anti-spam, I would still have a hard time picking
             one of these directions. It is even harder if you are not an expert in the application area.

             So don’t start off trying to design and build the perfect system. Instead, build and train a
             basic system quickly, perhaps in just a few days.⁵ Even if the basic system is far from the
             “best” system you can build, it is valuable to examine how the basic system functions: you
             will quickly find clues that show you the most promising directions in which to invest your
             time. These next few chapters will show you how to read these clues.
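
To make "a basic system" concrete, here is one possible sketch: a tiny bag-of-words naive Bayes classifier written with only the Python standard library, plus a helper that lists the dev-set examples it gets wrong. This is an assumption about what a quick baseline might look like, not a method the text prescribes. The names `train`, `predict` and `misclassified`, the "spam"/"ham" labels, and the toy training data are all made up for illustration.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")


def tokenize(text):
    """Lowercase and split on whitespace; deliberately crude for a first system."""
    return text.lower().split()


def train(examples):
    """examples: list of (text, label) pairs, with at least one example per label."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the higher log-probability under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


def misclassified(model, dev_set):
    """Collect the dev examples the model gets wrong, for manual inspection."""
    return [(text, label) for text, label in dev_set if predict(model, text) != label]


# Toy data, invented for illustration only.
train_set = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch tomorrow at noon", "ham"),
]
model = train(train_set)
```

The point of `misclassified` is the last step in the paragraph above: once a baseline like this exists, reading through the examples it gets wrong is where the clues about what to work on next come from.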




























             5  This advice is meant for readers wanting to build AI applications, rather than those whose goal is to
             publish academic papers. I will later return to the topic of doing research.


             Page 29                            Machine Learning Yearning-Draft                       Andrew Ng