
How about the Blackbox dev set? We previously said that dev sets of around 1,000-10,000
             examples are common. To refine that statement, a Blackbox dev set of 1,000-10,000
             examples will often give you enough data to tune hyperparameters and select among models,
             though there is little harm in having even more data. A Blackbox dev set of 100 would be
             small but still useful.
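To get a rough feel for why a Blackbox dev set of 100 is "small but still useful," here is a minimal sketch that computes the binomial standard error of a measured accuracy at several dev set sizes. It assumes the examples are drawn independently, and the 90% accuracy figure and the helper name `accuracy_standard_error` are illustrative choices, not something from the text.

```python
import math


def accuracy_standard_error(accuracy: float, n_examples: int) -> float:
    """Binomial standard error of an accuracy measured on n_examples.

    Assumes the examples are independent draws from the same distribution.
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / n_examples)


# Illustrative accuracy of 90%; the sizes mirror those discussed in the text.
for n in (100, 1_000, 10_000):
    se = accuracy_standard_error(0.90, n)
    print(f"n={n:>6}: standard error ~ {100 * se:.2f} percentage points")
```

Under these assumptions, the noise in the estimate shrinks with the square root of the dev set size, so a small Blackbox dev set can still separate models whose accuracies differ by a wide margin but is less reliable for telling apart models that are close.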


             If you have a small dev set, then you might not have enough data to split into Eyeball and
             Blackbox dev sets that are both large enough to serve their purposes. Instead, your entire dev
             set might have to be used as the Eyeball dev set—i.e., you would manually examine all the
             dev set data.

             Between the Eyeball and Blackbox dev sets, I consider the Eyeball dev set more important
             (assuming that you are working on a problem that humans can solve well and that examining
             the examples helps you gain insight). If you only have an Eyeball dev set, you can perform
             error analyses, model selection and hyperparameter tuning all on that set. The downside of
             having only an Eyeball dev set is that the risk of overfitting the dev set is greater.

             If you have plentiful access to data, then the size of the Eyeball dev set would be determined
             mainly by how many examples you have time to manually analyze. For example, I’ve rarely
             seen anyone manually analyze more than 1,000 errors.
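The sizing guidance above can be sketched as a small helper. This is only one possible reading of the advice, not a prescribed procedure: the function name `split_dev_set`, the `eyeball_budget` parameter (how many examples you have time to examine by hand), and the `min_blackbox` threshold of 100 are all hypothetical choices made for illustration.

```python
import random


def split_dev_set(dev_examples, eyeball_budget, min_blackbox=100, seed=0):
    """Split a dev set into (eyeball, blackbox) lists.

    eyeball_budget: number of examples you have time to examine manually.
    min_blackbox:   hypothetical minimum size for a useful Blackbox dev set.
    """
    shuffled = list(dev_examples)
    random.Random(seed).shuffle(shuffled)  # random split, reproducible via seed

    if len(shuffled) - eyeball_budget < min_blackbox:
        # Dev set too small to leave a useful Blackbox portion:
        # treat the entire dev set as the Eyeball dev set.
        return shuffled, []

    # Plentiful data: Eyeball size is set by the manual-analysis budget,
    # and everything else becomes the Blackbox dev set.
    return shuffled[:eyeball_budget], shuffled[eyeball_budget:]


eyeball, blackbox = split_dev_set(range(5000), eyeball_budget=500)
print(len(eyeball), len(blackbox))
```

When the fallback branch is taken and the whole dev set serves as the Eyeball dev set, error analysis, model selection and hyperparameter tuning all happen on the same examples, which is where the greater risk of overfitting the dev set comes from.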

             Page 39                            Machine Learning Yearning-Draft                       Andrew Ng