


                            These are written in a 7x8 grid and their images binarised. From the binary images
                            the  horizontal  (H1  to  H8)  and  vertical (V1  to  V7)  projections  are  obtained by
                            counting the dark pixels, as shown in Figure 5.14.
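
                               As a simple illustration of this step, the short sketch below computes the two
                            projection vectors of a binary image on an 8-row by 7-column grid. The pixel
                            pattern used here is an invented, rough "U" for illustration only, not the actual
                            prototype of Figure 5.14.

   import numpy as np

   # Hypothetical binarised "U" on a 7x8 grid (8 rows, 7 columns); 1 = dark pixel.
   img = np.array([[1, 0, 0, 0, 0, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1],
                   [0, 1, 0, 0, 0, 1, 0],
                   [0, 0, 1, 1, 1, 0, 0]])

   H = img.sum(axis=1)   # horizontal projections H1..H8: dark-pixel count per row
   V = img.sum(axis=0)   # vertical projections V1..V7: dark-pixel count per column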
                              Inspecting these projections for the "prototypes" U  and V  of  Figure 5.14, the
                            following features seem worth trying (other choices are possible):





                               Using a separable set of U's and V's (set 1), the perceptron adjusts a linear
                            discriminant until complete separation, as shown in Figure 5.15. The Perceptron
                            program allows learning to be performed in a pattern-by-pattern fashion, observing
                            the progress of the discriminant adjustment until convergence.
                               Using a non-separable set of U's and V's (set 2), the perceptron is unable to
                            converge and oscillates near the border between the U and V clusters. Figure 5.16
                            shows one of the best solutions obtained.
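
                               The pattern-by-pattern adjustment just described can be summarised in a short
                            sketch of the generic perceptron rule (this is not the Perceptron program itself;
                            the learning rate eta and the epoch limit are arbitrary choices). With a separable
                            set the loop ends with zero errors in an epoch; with a non-separable set such as
                            set 2 it keeps cycling, mirroring the oscillation described above.

   import numpy as np

   def train_perceptron(X, t, eta=0.1, max_epochs=100):
       # X: (n, d) feature matrix; t: (n,) targets coded as -1 / +1.
       Xb = np.hstack([np.ones((len(X), 1)), X])    # prepend a bias input of 1
       w = np.zeros(Xb.shape[1])
       for _ in range(max_epochs):
           errors = 0
           for x, target in zip(Xb, t):
               if target * np.dot(w, x) <= 0:       # pattern on the wrong side
                   w += eta * target * x            # adjust the discriminant
                   errors += 1
           if errors == 0:                          # complete separation reached
               return w, True
       return w, False                              # no convergence (e.g. set 2)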
                               The simple type of decision surface that one can achieve with the perceptron is,
                            of course, one of its limitations. Many textbooks illustrate this issue with the
                            classic XOR problem. This consists of separating the two-dimensional patterns
                            shown in Figure 5.17, whose target values correspond to the logical exclusive-or
                            (XOR) of the inputs x1 and x2, coding the logical variables as: 1=True, 0=False.

                            Figure 5.17.  The classic XOR  problem,  often used  to  illustrate neural  classifier
                            performance.



                               As the two classes of XOR patterns are not linearly separable, it is customary to
                            say that  it  is not  possible to solve this  problem with  a perceptron. However, we
                            must not forget that we may use transformed features as inputs. For instance, we
                            can use a quadratic transformation of the features. As seen in 2.1.1, we would then
                             need to compute (d+2)(d+1)/2=6 new features and use a perceptron with 6 weights:
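
                               A minimal sketch of this idea follows, assuming the usual monomial expansion
                            1, x1, x2, x1^2, x1*x2, x2^2 (the particular features chosen in the text may
                            differ) and reusing the pattern-by-pattern rule sketched earlier, with an
                            arbitrary learning rate. In the transformed space the XOR patterns are linearly
                            separable, so the perceptron now converges.

   import numpy as np

   # XOR patterns (x1, x2) and targets, with True/False recoded as +1/-1.
   X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
   t = np.array([-1, +1, +1, -1])

   def quadratic_features(x1, x2):
       # 6 monomials of degree <= 2, i.e. (d+2)(d+1)/2 features for d = 2.
       return np.array([1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2])

   Z = np.array([quadratic_features(x1, x2) for x1, x2 in X])

   # Pattern-by-pattern perceptron rule with 6 weights; it converges because the
   # transformed XOR patterns are separable (e.g. by x1 + x2 - 2*x1*x2).
   w, eta = np.zeros(6), 0.5
   for _ in range(100):
       errors = 0
       for z, target in zip(Z, t):
           if target * np.dot(w, z) <= 0:
               w += eta * target * z
               errors += 1
       if errors == 0:
           break

   print(np.sign(Z @ w))   # -> [-1.  1.  1. -1.], all four patterns correct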