
Here, by “Small NN” we mean a neural network with only a small number of hidden units/layers/parameters. Finally, if you train larger and larger neural networks, you can obtain even better performance:¹

[Figure: performance vs. amount of data, with one curve for a small NN and higher curves for progressively larger NNs; the green curve (a very large NN) achieves the best performance.]
Thus, you obtain the best performance when you (i) train a very large neural network, so that you are on the green curve above; and (ii) have a huge amount of data.

             Many other details such as neural network architecture are also important, and there has
             been much innovation here. But one of the more reliable ways to improve an algorithm’s
             performance today is still to (i) train a bigger network and (ii) get more data.
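The two claims above (and the comparison in the footnote below) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic, well-separated 2-D data with hypothetical hyperparameters, not a rigorous benchmark: it compares logistic regression against a small one-hidden-layer neural network when trained on 20 versus 1,000 examples.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_blobs(n):
    # Synthetic data: two well-separated Gaussian classes (illustration only).
    X0 = rng.normal(loc=-2.0, scale=1.0, size=(n // 2, 2))
    X1 = rng.normal(loc=+2.0, scale=1.0, size=(n // 2, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * (n // 2) + [1] * (n // 2))
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=500):
    # Logistic regression trained by batch gradient descent.
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        g = sigmoid(X @ w + b) - y          # gradient of cross-entropy loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return lambda Xt: (sigmoid(Xt @ w + b) > 0.5).astype(int)

def train_small_nn(X, y, hidden=8, lr=0.1, steps=500):
    # A "small NN": one hidden layer of tanh units, trained by gradient descent.
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden); b2 = 0.0
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)
        g = (sigmoid(H @ W2 + b2) - y) / len(y)   # output-layer error
        gH = np.outer(g, W2) * (1 - H ** 2)       # backprop through tanh
        W2 -= lr * H.T @ g; b2 -= lr * g.sum()
        W1 -= lr * X.T @ gH; b1 -= lr * gH.sum(axis=0)
    return lambda Xt: (sigmoid(np.tanh(Xt @ W1 + b1) @ W2 + b2) > 0.5).astype(int)

for n in (20, 1000):
    Xtr, ytr = make_blobs(n)
    Xte, yte = make_blobs(400)
    for name, train in (("logreg  ", train_logreg), ("small NN", train_small_nn)):
        acc = (train(Xtr, ytr)(Xte) == yte).mean()
        print(f"n={n:5d}  {name}: test accuracy {acc:.2f}")
```

On data this easy, both models do about equally well at n=20 and at n=1000; the gap that favors large networks only emerges on harder tasks with huge datasets, which is exactly the regime the green curve above describes.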





¹ This diagram shows NNs doing better in the regime of small datasets. This effect is less consistent
             than the effect of NNs doing well in the regime of huge datasets. In the small data regime, depending
             on how the features are hand-engineered, traditional algorithms may or may not do better. For
             example, if you have 20 training examples, it might not matter much whether you use logistic
             regression or a neural network; the hand-engineering of features will have a bigger effect than the
             choice of algorithm. But if you have 1 million examples, I would favor the neural network.


             Page 11                            Machine Learning Yearning-Draft                       Andrew Ng