dichotomies reaches the value 2^n for a maximum of n = d+1 patterns, in accordance
with formula (5-51). Thus, for this case, d_VC = d+1.
For training set sizes below d_VC the neural net learns the training set without error but does not
generalize. It is only above d_VC that the neural net is capable of generalization.
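As an illustrative sketch, the following Python snippet counts the linearly separable dichotomies of n points in general position in d dimensions using Cover's counting formula (assumed here to be the content of formula (5-51)) and checks that the count equals 2^n exactly while n <= d+1:

from math import comb

def dichotomies(n, d):
    # Cover's count of linearly separable dichotomies of n points in
    # general position in d dimensions (assumed to correspond to (5-51)).
    return 2 * sum(comb(n - 1, k) for k in range(d + 1))

d = 2
for n in range(1, 6):
    print(n, dichotomies(n, d), 2 ** n)
# The count equals 2^n for n <= d+1 = 3 and falls below 2^n afterwards,
# i.e. d_VC = d+1 for a single linear discriminant.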
In general, the Vapnik-Chervonenkis dimension is, unfortunately, very difficult
to determine. Notice that in order to show that the VC dimension of an MLP is at
least m, one must simply find some set of m regularly distributed points shattered
by the MLP. However, in order to show that the VC dimension of an MLP is at most
m, one must show that no set of m+1 regularly distributed points is shattered by the
MLP. Therefore, in the case of Figure 5.32, a lower bound on d_VC is 5. As a matter of
fact, a lower bound for the VC dimension of a two-class MLP with hard limiters
is given precisely by the maximum number of linearly separable regions, formula
(5-52).
Figure 5.33 shows the values of this lower bound for several values of d and h,
obtained with the PR Size program (see Appendix B).
Figure 5.33. Lower bound on the VC dimension for a two-class MLP with hard-
limited activation functions (logarithmic vertical scale).
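The sketch below computes this lower bound under the assumption that formula (5-52) is the classical count of the maximum number of regions into which h hyperplanes (one per hard-limited hidden unit) partition d-dimensional input space; the printed values are of the same kind as those plotted in Figure 5.33, though they are not claimed to reproduce the PR Size output exactly:

from math import comb

def max_regions(h, d):
    # Maximum number of regions created by h hyperplanes in d-dimensional
    # space (assumed to correspond to formula (5-52)); used in the text as
    # a lower bound on d_VC for a two-class MLP with hard limiters.
    return sum(comb(h, k) for k in range(d + 1))

for d in (2, 4, 8):
    print(d, [max_regions(h, d) for h in (2, 4, 8, 16)])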
Bounds on the training set size needed for MLP generalization depend on the
VC dimension. A lower bound on the number of training samples needed for
generalization is due to Blumer et al. (1989):
n >= max( ((1 - Pe)/Pe) ln(1/a), d_VC (1 - 2(Pe(1 - a) + a)) ),
where Pe is the true error rate of the neural network, and 1 - a is the confidence
level that the design set estimate of the error rate will approximate the true error rate with arbitrary closeness.
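As a small numerical illustration, the sketch below evaluates the bound as reconstructed above for hypothetical values of Pe, a and d_VC (the specific numbers are illustrative only, not taken from the text):

from math import log

def blumer_lower_bound(pe, a, d_vc):
    # max( (1-Pe)/Pe * ln(1/a), d_VC * (1 - 2*(Pe*(1-a) + a)) ),
    # the lower bound of Blumer et al. (1989) as reconstructed above.
    term_1 = (1.0 - pe) / pe * log(1.0 / a)
    term_2 = d_vc * (1.0 - 2.0 * (pe * (1.0 - a) + a))
    return max(term_1, term_2)

# Hypothetical example: Pe = 0.05, confidence 1-a = 0.95, d_VC = 50.
print(blumer_lower_bound(0.05, 0.05, 50))   # roughly 57 samples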