
23 Addressing Bias and Variance




             Here is the simplest formula for addressing bias and variance issues:

             • If you have high avoidable bias, increase the size of your model (for example, increase the
               size of your neural network by adding layers/neurons).


             • If you have high variance, add data to your training set.
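             The two rules above can be sketched as a small decision helper. This is only an
             illustration: it assumes avoidable bias is measured as training error minus the optimal
             error rate, and variance as dev error minus training error. The function name and the
             `threshold` cutoff are hypothetical, not prescribed by the text.

```python
def diagnose(optimal_error, train_error, dev_error, threshold=0.02):
    """Suggest next steps from three error rates (fractions in [0, 1]).

    threshold is a hypothetical cutoff for calling a gap "high";
    choose a value that suits your application.
    """
    # Assumed definitions: gap to the best achievable error, and
    # gap between dev set error and training set error.
    avoidable_bias = train_error - optimal_error
    variance = dev_error - train_error

    suggestions = []
    if avoidable_bias > threshold:
        suggestions.append("increase model size")
    if variance > threshold:
        suggestions.append("add training data")
    return avoidable_bias, variance, suggestions


# Example: training error far above optimal, dev error close to training error.
bias, variance, steps = diagnose(optimal_error=0.01, train_error=0.15, dev_error=0.16)
print(steps)  # ['increase model size']
```

             Both suggestions can be returned at once, since a model may suffer from high avoidable
             bias and high variance at the same time.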

             If you are able to increase the neural network size and increase training data without limit, it
             is possible to do very well on many learning problems.

             In practice, increasing the size of your model will eventually cause you to run into
             computational problems because training very large models is slow. You might also exhaust
             your ability to acquire more training data. (Even on the internet, there is only a finite
             number of cat pictures!)

             Different model architectures (for example, different neural network architectures) will
             have different amounts of bias/variance for your problem. Recent deep learning research
             has produced many innovative model architectures, so if you are using neural networks,
             the academic literature can be a great source of inspiration. There are also many great
             open-source implementations on GitHub. But the results of trying new architectures are
             less predictable than the simple formula of increasing the model size and adding data.

             Increasing the model size generally reduces bias, but it might also increase variance and the
             risk of overfitting. However, this overfitting problem usually arises only when you are not
             using regularization. If you include a well-designed regularization method, then you can
             usually safely increase the size of the model without increasing overfitting.


             Suppose you are applying deep learning, with L2 regularization or dropout, with the
             regularization parameter that performs best on the dev set. If you increase the model size,
             usually your performance will stay the same or improve; it is unlikely to worsen significantly.
             The only reason to avoid using a bigger model is the increased computational cost.
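             One way to picture this procedure is to retune the regularization strength on the dev
             set separately for each model size, then compare sizes using their best dev error. The
             sketch below assumes a hypothetical callable `dev_error_fn(model_size, lam)` that trains
             a model of the given size with regularization strength `lam` and returns its dev set
             error. All names here are illustrative.

```python
def best_dev_error(model_size, lambdas, dev_error_fn):
    """Return (best_lambda, dev_error) for one model size.

    dev_error_fn(model_size, lam) is a hypothetical callable that trains a
    model with regularization strength lam and returns its dev set error.
    """
    results = [(dev_error_fn(model_size, lam), lam) for lam in lambdas]
    error, lam = min(results)  # lowest dev error wins
    return lam, error


def compare_sizes(sizes, lambdas, dev_error_fn):
    """Tune the regularization strength separately for each model size."""
    return {size: best_dev_error(size, lambdas, dev_error_fn) for size in sizes}


# Usage, with train_and_eval standing in for your own training routine:
# report = compare_sizes([64, 256, 1024], [0.0, 0.01, 0.1], train_and_eval)
# for size, (lam, err) in report.items():
#     print(size, lam, err)
```

             Comparing sizes at a single fixed regularization strength can be misleading, because a
             larger model may need a different setting to avoid overfitting.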














             Page 49                            Machine Learning Yearning-Draft                       Andrew Ng