5.2 Activation Functions
We have mentioned previously that, in order to obtain more complex discriminants, some type of non-linear function has to be used, as shown in Figure 5.7. The non-linear function f is called an activation function. The generalized decision function can now be written as:

d(x) = f(w'x) .    (5-8)
Figure 5.7. Connectionist structure of a generalized decision function. The processing unit (rectangle with dotted line) implements d(x).
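To make equation (5-8) concrete, the following minimal sketch (in Python with NumPy; the function name decision and the particular quadratic activation are illustrative choices, not taken from the text) shows a processing unit that forms the inner product w'x and passes it through an activation function f:

```python
import numpy as np

def decision(x, w, f):
    """Generalized decision function d(x) = f(w'x), cf. equation (5-8)."""
    return f(np.dot(w, x))

# Illustrative use with a two-dimensional pattern and an arbitrary
# non-linear activation function.
w = np.array([1.0, -0.5])
x = np.array([0.3, 1.2])
print(decision(x, w, lambda u: u ** 2 - 1.0))
```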
Let us consider the one-dimensional two-class example shown in Figure 5.8a, with three points -1, 0 and 1 and respective target values 1, -1, 1. We now have a class (target value 1) with disconnected regions (the points -1 and 1). We try to discriminate between the two classes by using the following parabolic activation function:

f(u) = u² - 1 .    (5-9)
In order to study the energy function in a simple way, we assume, as in the previous example of Figure 5.2, that the parabolic activation function is directly applied to the linear discriminant, as shown in equation (5-9a):

d(x) = f(ax + b) = (ax + b)² - 1 .    (5-9a)

From now on we represent the output neuron of the discriminant unit by an open circle containing both the summation and the activation function, as in Figure 5.8b.
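As a quick numerical check of this construction (a minimal sketch; the array names points and targets and the helper d are our additions), the discriminant of equation (5-9a) can be evaluated on the three points for a few weight choices:

```python
import numpy as np

# The one-dimensional example of Figure 5.8a.
points  = np.array([-1.0, 0.0, 1.0])
targets = np.array([ 1.0, -1.0, 1.0])

def d(x, a, b):
    """Parabolic activation applied to the linear discriminant, eq. (5-9a)."""
    return (a * x + b) ** 2 - 1.0

print(d(points, 1.0, 0.0))           # misses the outer targets: [ 0. -1.  0.]
print(d(points, np.sqrt(2.0), 0.0))  # matches all three targets (up to rounding)
```

The second choice, a = √2 and b = 0, is precisely the solution discussed below.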
The energy function is symmetric and is partly represented in Figure 5.8c. The rugged aspect of the surface is due solely to the discrete step of 0.5 used for the weights a and b. There are two global minima, at (√2, 0) and (-√2, 0), both corresponding to the obvious solution d(x) = 2x² - 1, which passes exactly through the three target points.