Page 183 - Artificial Intelligence in the Age of Neural Networks and Brain Computing
3. From RNNs to Mouse-Level Computational Intelligence: Next Big Things
FIGURE 8.9
The time-lagged recurrent network (TLRN) [14,17].
Most of the way through the competition, Sven informed me that the statistics
students, using 22 old, well-established methods, were doing much better than the
teams from the other fields, despite those teams' enormous amounts of effort,
cleverness, and creativity.
When I heard that, I remembered how many students even in the neural network
field were using simple popularized methods like naïve feedforward networks or
Vapnik methods or clever kludges, and simply had not studied the solid mathematics
of TLRNs. And so I mentioned the situation to friends in Ford Motor Research,
which was then the world’s leader in advanced practical neural network research
and applications (and had allowed its people like Feldkamp, Puskorius, Marko,
and Prokhorov to publish extensively). After many years of careful fair comparisons,
that group had settled on TLRNs trained by backpropagation through time for all its
mission-critical applications, like minimum cost compliance with stiff new EPA
Clean Air rules. They had developed practical procedures for “multistreaming” to
use their in-house general TLRN package effectively in safety-critical large general
applications. Without a lot of effort and with no tweaking, they input Crone's
challenge into their system, and quickly turned out the clear number one winner. Later, at
IJCNN 2011, Ford was a lead sponsor of the time-series competition, and did not
enter, perhaps because the rules required full disclosure by all contestants, but
quietly verified that their system still outperformed all the competition.
It is important to understand why the standard, basic general TLRN has a firm
foundation in statistics [25]. Mathematicians such as Andrew Barron [26] and
Eduardo Sontag have proven that standard feedforward ANNs can approximate
smooth nonlinear functions much more accurately than other approximators like
Taylor series as the number of inputs grows, and that the universal approximation