Page 40 -
P. 40
19 Takeaways: Basic error analysis
• When you start a new project, especially if it is in an area in which you are not an expert,
it is hard to correctly guess the most promising directions.
• So don’t start off trying to design and build the perfect system. Instead build and train a
basic system as quickly as possible—perhaps in a few days. Then use error analysis to
help you identify the most promising directions and iteratively improve your algorithm
from there.
• Carry out error analysis by manually examining ~100 dev set examples the algorithm
misclassifies and counting the major categories of errors. Use this information to
prioritize what types of errors to work on fixing.
• Consider splitting the dev set into an Eyeball dev set, which you will manually examine,
and a Blackbox dev set, which you will not manually examine. If performance on the
Eyeball dev set is much better than the Blackbox dev set, you have overfit the Eyeball dev
set and should consider acquiring more data for it.
• The Eyeball dev set should be big enough so that your algorithm misclassifies enough
examples for you to analyze. A Blackbox dev set of 1,000-10,000 examples is sufficient
for many applications.
• If your dev set is not big enough to split this way, just use an Eyeball dev set for manual
error analysis, model selection, and hyperparameter tuning.
Page 40 Machine Learning Yearning-Draft Andrew Ng