Page 18 - Artificial Intelligence in the Age of Neural Networks and Brain Computing





                     The knobs of the potentiometers, seen in the photo, were manually rotated during
                  the training process in accordance with the LMS algorithm. The sum (SUM) was
                  displayed by the meter. Once trained, output decisions were +1 if the meter reading
                  was positive, and −1 if the meter reading was negative.
                     The earliest learning experiments were done with this Adaline, training it as a
                  pattern classifier. This was supervised learning, as the desired response for each
                  input training pattern was given. A video showing Prof. Widrow training Adaline
                  can be seen online [https://www.youtube.com/watch?v=skfNlwEbqck].



                  3. UNSUPERVISED LEARNING WITH ADALINE, FROM THE
                     1960s
                  To train Adaline in this way, a desired response is needed for each input
                  training pattern. The desired response indicates the class of the pattern. But what
                  if one had only input patterns and did not know their desired responses, their classes?
                  Could learning still take place? If this were possible, this would be unsupervised
                  learning.
                     In 1960, unsupervised learning experiments were made with the Adaline of
                  Fig. 1.2 as follows. Initial conditions for the weights were randomly set and input
                  patterns were presented without desired responses. If the response to a given input
                  pattern was already positive (the meter reading to the right of zero), the desired
                  response was taken to be exactly +1. A response of +1 was indicated by a meter
                  reading half way on the right-hand side of the scale. If the response was less
                  than +1, adaptation by LMS was performed to bring the response up toward +1.
                  If the response was greater than +1, adaptation was performed by LMS to bring
                  the response down toward +1.
                     If the response to another input pattern was negative (meter reading to the left of
                  zero), the desired response was taken to be exactly −1 (meter reading half way on
                  the left-hand side of the scale). If the negative response was more positive than −1,
                  adaptation was performed to bring the response down toward −1. If the response
                  was more negative than −1, adaptation was performed to bring the response up
                  toward −1.
                     With adaptation taking place over many input patterns, some patterns that
                  initially responded as positive could ultimately reverse and give negative responses,
                  and vice versa. However, patterns that were initially responding as positive were
                  more likely to remain positive, and vice versa. Once the process converged and
                  the responses stabilized, some responses would cluster about +1 and the rest would
                  cluster about −1.
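
The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the original 1960 setup: the function name `bootstrap_lms`, the step size `mu`, the number of passes `epochs`, the three example patterns, and the random initial weights are all assumptions made here. The update is written in the commonly used LMS form w ← w + 2·mu·error·x, which this page does not state explicitly, and a response of exactly zero is treated as positive, a case the text does not address.

```python
import numpy as np


def bootstrap_lms(patterns, weights, mu=0.05, epochs=200):
    """Unsupervised LMS sketch: the sign of the current analog response
    is used as the desired response (+1 or -1) for that pattern."""
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(epochs):
        for x in patterns:
            y = w @ x                        # analog output of the summer (SUM)
            d = 1.0 if y >= 0.0 else -1.0    # self-assigned desired response
            # LMS step: moves y toward d, whether y is above or below d.
            w += 2.0 * mu * (d - y) * x
    return w


# Hypothetical demo data: three mutually orthogonal (hence linearly
# independent) patterns presented to an Adaline with four weights.
patterns = np.array([
    [1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
])
rng = np.random.default_rng(0)
w0 = rng.normal(size=4)                      # random initial weights

w = bootstrap_lms(patterns, w0)
print("initial responses:", patterns @ w0)
print("final responses:  ", patterns @ w)
```

The demo patterns are mutually orthogonal, so adapting on one pattern leaves the responses to the others unchanged and no response reverses sign. With correlated patterns, adapting on one pattern shifts the responses to the others, which is how the reversals mentioned above can arise.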
                     The objective was to achieve unsupervised learning with the analog responses at
                  the output of the summer (SUM) clustered at +1 or −1. Perfect clustering could be
                  achieved if the training patterns were linearly independent vectors whose number
                  was less than or equal to the number of weights. Otherwise, clustering to +1
                  or −1 would be done as well as possible in the least squares sense. The result