Page 87 - Foundations of Cognitive Psychology : Core Readings


86 Jay L. McClelland, David E. Rumelhart, and Geoffrey E. Hinton

spontaneous generalization, extending behavior appropriate for one pattern to
other similar patterns. This property is shared by other PDP models, such as
the word perception model and the Jets and Sharks model described above; the
main difference here is in the existence of simple, local, learning mechanisms
that can allow the acquisition of the connection strengths needed to produce
these generalizations through experience with members of the ensemble of
patterns. Distributed models have another interesting property as well: If
there are regularities in the correspondences between pairs of patterns, the
model will naturally extract these regularities. This property allows distributed
models to acquire patterns of interconnections that lead them to behave in ways
we ordinarily take as evidence for the use of linguistic rules.
We describe one such model very briefly. The model is a mechanism that
learns how to construct the past tenses of words from their root forms through
repeated presentations of examples of root forms paired with the corresponding
past-tense form. The model consists of two pools of units. In one pool,
patterns of activation representing the phonological structure of the root form
of the verb can be represented, and, in the other, patterns representing the
phonological structure of the past tense can be represented. The goal of the
model is simply to learn the right connection strengths between the root units
and the past-tense units, so that whenever the root form of a verb is presented
the model will construct the corresponding past-tense form. The model is
trained by presenting the root form of the verb as a pattern of activation over
the root units, and then using a simple, local, learning rule to adjust the
connection strengths so that this root form will tend to produce the correct
pattern of activation over the past-tense units. The model is tested by simply
presenting the root form as a pattern of activation over the root units and
examining the pattern of activation produced over the past-tense units.
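
The train-then-test cycle described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the text does
not specify the learning rule or the phonological encoding, so the sketch
assumes deterministic threshold units, a perceptron-style error-correcting
update, and invented binary vectors in place of phonological patterns. The
names make_associator, produce, train_pair, train, and PAIRS are hypothetical.

```python
# Minimal pattern-associator sketch (assumptions: threshold units, a
# perceptron-style error-correcting rule, toy binary patterns).

def make_associator(n_root, n_past):
    """Create zeroed connections: w[j][i] links root unit i to past-tense unit j."""
    return {"w": [[0.0] * n_root for _ in range(n_past)], "b": [0.0] * n_past}

def produce(net, root):
    """Test step: present a root pattern and read off the past-tense pattern."""
    output = []
    for w_j, b_j in zip(net["w"], net["b"]):
        net_input = b_j + sum(w * x for w, x in zip(w_j, root))
        output.append(1 if net_input > 0 else 0)
    return output

def train_pair(net, root, target, lr=0.5):
    """Training step: adjust each connection using only the activity of the
    root unit it comes from and the error of the past-tense unit it feeds."""
    output = produce(net, root)
    for j, (t, o) in enumerate(zip(target, output)):
        err = t - o
        net["b"][j] += lr * err
        for i, x in enumerate(root):
            net["w"][j][i] += lr * err * x

def train(net, pairs, epochs=50):
    """Repeatedly present every (root, past-tense) example."""
    for _ in range(epochs):
        for root, target in pairs:
            train_pair(net, root, target)

# Invented (root pattern, past-tense pattern) pairs; not real phonology.
PAIRS = [
    ([1, 0, 1, 0], [1, 0, 1]),
    ([0, 1, 1, 0], [0, 1, 1]),
    ([1, 1, 0, 1], [1, 1, 0]),
]

net = make_associator(n_root=4, n_past=3)
train(net, PAIRS)
results = [produce(net, root) for root, _ in PAIRS]
print(results)
```

The update is local in the sense used in the text: changing w[j][i] needs no
information beyond root unit i and past-tense unit j.
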
The model is trained initially with a small number of verbs children learn
early in the acquisition process. At this point in learning, it can only produce
appropriate outputs for inputs that it has explicitly been shown. But as it
learns more and more verbs, it exhibits two interesting behaviors. First, it
produces the standard -ed past tense when tested with pseudo-verbs or verbs it
has never seen. Second, it “overregularizes” the past tense of irregular words
it previously completed correctly. Often, the model will blend the irregular
past tense of the word with the regular -ed ending, and produce errors like
CAMED as the past of COME. These phenomena mirror those observed in the early
phases of acquisition of control over past tenses in young children.
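
A much-reduced toy can show how both behaviors might fall out of the same
error-correcting learning. It is not the model described in the text: a single
threshold unit stands in for "add the regular -ed ending" (so it cannot
produce blends such as CAMED), the three-feature verb vectors are invented,
and the names ed_marker, train_epoch, COME, EARLY, LATER, and NOVEL are
hypothetical.

```python
# Toy sketch only: one "-ed marker" unit, invented verb features, and a
# perceptron-style update assumed as the simple local learning rule.

def ed_marker(net, root):
    """Return 1 if the unit signals the regular -ed ending for this root."""
    net_input = net["b"] + sum(w * x for w, x in zip(net["w"], root))
    return 1 if net_input > 0 else 0

def train_epoch(net, verbs, lr=0.5):
    """One pass over (root, target) pairs with an error-correcting update."""
    for root, target in verbs:
        err = target - ed_marker(net, root)
        net["b"] += lr * err
        for i, x in enumerate(root):
            net["w"][i] += lr * err * x

COME = [1, 1, 0]                                   # irregular: target 0 (no -ed)
EARLY = [(COME, 0), ([0, 0, 1], 1)]                # small initial vocabulary
LATER = EARLY + [([1, 0, 0], 1), ([0, 1, 0], 1)]   # more regular verbs arrive
NOVEL = [1, 0, 1]                                  # never presented in training

net = {"w": [0.0, 0.0, 0.0], "b": 0.0}

# Stage 1: only the early verbs.
for _ in range(5):
    train_epoch(net, EARLY)
early_come = ed_marker(net, COME)

# Stage 2: a single pass over the enlarged vocabulary.
train_epoch(net, LATER)
mid_come = ed_marker(net, COME)
mid_novel = ed_marker(net, NOVEL)

# Further passes over the same enlarged vocabulary.
for _ in range(20):
    train_epoch(net, LATER)
late_come = ed_marker(net, COME)
late_novel = ed_marker(net, NOVEL)

print(early_come, mid_come, mid_novel, late_come, late_novel)  # 0 1 1 0 1
```

With these particular toy vectors, COME is handled correctly after the first
stage, is wrongly marked for -ed after one pass over the larger vocabulary,
and is correct again after further passes, while the unseen pattern receives
the regular ending throughout. The later recovery is a property of this toy
set, not a claim drawn from the text.
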
The generativity of the child’s responses (the creation of regular past tenses
of new verbs and the overregularization of the irregular verbs) has been taken
as strong evidence that the child has induced the rule which states that the
regular correspondence for the past tense in English is to add a final -ed
(Berko, 1958). On the evidence of its performance, then, the model can be said
to have acquired the rule. However, no special rule-induction mechanism is
used, and no special language-acquisition device is required. The model learns
to behave in accordance with the rule, not by explicitly noting that most words
take -ed in the past tense in English and storing this rule away explicitly,
but simply by building up a set of connections in a pattern associator through
a long series of simple learning experiences. The same mechanisms of parallel
distributed
