Page 819 - Mechanical Engineers' Handbook (Volume 2)

810 Neural Networks in Feedback Control Systems

where $W^T = [w_{kj}]$ is a matrix of output representative values and the FL basis functions $\phi_j(\cdot)$
play the role of NN activation functions. Using product inferencing, the basis functions are
given in terms of the one-dimensional membership functions (MFs) $\mu_{ij}(x_i, U_{ij})$ by

$$\phi_j(x, U) = \frac{\prod_{i=1}^{n} \mu_{ij}(x_i, U_{ij})}{\sum_{j=1}^{L} \prod_{i=1}^{n} \mu_{ij}(x_i, U_{ij})}$$

where $U_{ij}$ is a vector of parameters of the MFs including the centroids and spreads. The
number of rules is L. The standard choice for the MFs is triangle functions. However, other
choices have been used, including splines (cf. Ref. 4, CMAC NN), second- or third-degree
polynomials, or radial basis functions (RBFs) (Ref. 36).
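The basis-function formula above can be illustrated with a minimal Python sketch. It assumes symmetric triangular MFs in which each $U_{ij}$ is a (centroid, spread) pair; that parameterization, the function names `triangle_mf`, `fl_basis`, and `fl_output`, and the numerical values are all hypothetical choices made for illustration and are not taken from the handbook.

```python
import numpy as np


def triangle_mf(x, centroid, spread):
    """Symmetric triangular MF: 1 at the centroid, 0 at centroid +/- spread."""
    return np.maximum(0.0, 1.0 - np.abs(x - centroid) / spread)


def fl_basis(x, centroids, spreads):
    """Normalized product-inference basis functions phi_j(x, U).

    x:         shape (n,)    input vector
    centroids: shape (L, n)  centroid of mu_ij for rule j and input i
    spreads:   shape (L, n)  spread of mu_ij (assumed positive)
    """
    mu = triangle_mf(x[None, :], centroids, spreads)  # mu_ij, shape (L, n)
    firing = mu.prod(axis=1)                          # product over inputs i
    total = firing.sum()                              # sum over rules j
    if total == 0.0:
        raise ValueError("x lies outside the support of every rule")
    return firing / total


def fl_output(x, W, centroids, spreads):
    """FL system output y = W^T phi(x, U), with W of shape (L, m)."""
    return W.T @ fl_basis(x, centroids, spreads)


# Illustrative values only: n = 2 inputs, L = 2 rules, m = 1 output.
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
spreads = np.ones((2, 2))
W = np.array([[0.0], [4.0]])
x = np.array([0.25, 0.5])

phi = fl_basis(x, centroids, spreads)      # [0.75, 0.25]
y = fl_output(x, W, centroids, spreads)    # [1.0]
```

Because the basis functions are normalized, they sum to one for any input that fires at least one rule, so the output is a weighted average of the representative values in $W$. Swapping `triangle_mf` for a spline, polynomial, or RBF membership function leaves the rest of the sketch unchanged.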
FL systems have the connotation of higher level supervisors since they are rule based.
The fuzzy-neural reinforcement learning scheme shown in Fig. 16 has been developed, where
an FL system serves as a critic and an NN serves as an action-generating network that controls
the system. The reinforcement controller is adaptive in the sense that both the FL critic and
the NN action-generating network are tuned to improve system performance through online
learning. Stability and convergence proofs have been provided and depend on using certain
specialized tuning schemes for the FL critic membership functions and the NN weights.
Tuning the membership functions has the effect of modifying them so they converge onto
the region of the state space with the highest state trajectory activity, a form of dynamic focusing of awareness.
The advantage of the FL/NN adaptive reinforcement learning structure is that the critic
can be initialized using linguistic/heuristic notions by the human user. Finally, for FL systems
one can look at the final MFs and interpret what information has been stored in the
system through learning.
9 OPTIMAL CONTROL USING NNs
Heretofore we have discussed the design of NN controllers for tracking and stabilization
based on control theory techniques including feedback linearization, backstepping, singular
perturbations, force control, dynamic inversion, and observer design. The point was made

[Figure 16 diagram. Blocks: FL adaptive critic, performance evaluator, action-generating NN, unknown plant. Labeled signals: desired trajectory, instantaneous utility R(t), tuning, r(t), $\hat{f}(x)$, u(t), x(t), d(t).]
Figure 16 Fuzzy logic adaptive reinforcement learning NN controller.
