5.13 Modular Neural Networks
In general, the idea behind modular neural networks is to profit from what each
neural net can do best, so that they co-operate towards the goal of attaining a high
classification performance.
The hierarchical and ensemble approaches, although often achieving very good
results, use the neural modules in a decoupled way: there are no mechanisms
for guiding the input feature vector to the most adequate module, and the modules
are not trained in a co-operative way, in which each module would be tuned to its
specialized recognition task taking into account what the other modules are doing.
A comparative survey of modular networks with a description of co-operative
mechanisms can be found in (Gasser A, Kamel M, 1998).
5.14 Neural Networks in Data Mining
The purpose of data mining and the application of statistical classification in data
mining were presented in section 4.7. Neural network tools play an important role
in data mining, namely feature selection methods based on genetic algorithms,
Kohonen's self-organising feature maps and multi-layer perceptrons. These are
used for classification or regression tasks, both called predictive modelling in data
mining jargon. The same requirements on algorithmic performance and evaluation
of solutions, presented in section 4.7, are applicable to neural network approaches.
Especially of interest in data mining applications are multi-layer perceptrons
solving complex regression/forecast tasks. In order to give a taste of such an
application to a typical data mining problem, and to discuss some important issues,
we will consider the problem of determining a useful predictive model for the
revenue of invested capital using the Firms dataset, which contains a table of
economic variables for 838 Portuguese firms (year 1995).
In order to build a predictive model for the capital revenue (variable CAPR),
defined as the ratio of the net income (NI) over the invested capital (CAP), we may
select as variables constituting the search space all those that bear no direct relation
to CAPR, namely GI (gross income), CA (capital plus assets), NW (number of
workers), P (apparent productivity), GIR (gross income revenue), A/C (assets
share) and DEPR (depreciations plus provisions), discarding the variables CAP and
NI, which define CAPR and are therefore not of interest as predictors.
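The variable selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Firms data are not reproduced here, so the table is filled with synthetic random values, and the names `firms`, `excluded` and `candidates` are hypothetical helpers introduced for the example. Only the variable abbreviations and the definitions CAPR=NI/CAP and GIR=NI/GI come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms = 838  # number of cases quoted in the text

# Synthetic placeholder values; the real Firms dataset is NOT reproduced here.
firms = {
    "NI": rng.normal(50.0, 20.0, n_firms),        # net income
    "CAP": rng.uniform(100.0, 1000.0, n_firms),   # invested capital
    "GI": rng.uniform(200.0, 2000.0, n_firms),    # gross income
    "CA": rng.uniform(100.0, 5000.0, n_firms),    # capital plus assets
    "NW": rng.integers(5, 500, n_firms).astype(float),  # number of workers
    "P": rng.uniform(1.0, 50.0, n_firms),         # apparent productivity
    "A/C": rng.uniform(0.0, 1.0, n_firms),        # assets share
    "DEPR": rng.uniform(0.0, 100.0, n_firms),     # depreciations plus provisions
}
firms["GIR"] = firms["NI"] / firms["GI"]    # gross income revenue, GIR = NI/GI
firms["CAPR"] = firms["NI"] / firms["CAP"]  # target: capital revenue, NI/CAP

target = "CAPR"
excluded = {"CAP", "NI", target}  # the variables that define the target
candidates = [name for name in firms if name not in excluded]

# Design matrix (one column per candidate variable) and target vector.
X = np.column_stack([firms[name] for name in candidates])
y = firms[target]
print(candidates, X.shape, y.shape)
```

The matrix `X` would then be handed to whatever feature selection procedure is available; the genetic algorithm search itself is not sketched here.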
Performing feature selection with the genetic algorithm tool yielded GIR as the
only useful variable. This is a somewhat expected result, given that
GIR=NI/GI and NI is directly related to CAPR. Running the Statistica intelligent
problem solver (IPS) for a quick search for an MLP solution, a reasonably
performing MLP1:1:1 was found, using variable GIR as input and achieving a
0.765 correlation.
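To make the size of such a network concrete, the following sketch trains an MLP with one input, one hidden neuron and one linear output by plain batch gradient descent, and then computes the correlation between the network output and the target. It is a minimal illustration, not the procedure used by the IPS: we assume that MLP1:1:1 denotes the number of units per layer and that the quoted correlation is measured between output and target, the data are synthetic stand-ins for (GIR, CAPR), and the hyperbolic tangent activation, learning rate and number of epochs are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic one-dimensional regression data standing in for (GIR, CAPR).
x = rng.uniform(-2.0, 2.0, 500)
y = np.tanh(1.5 * x) + rng.normal(0.0, 0.3, x.size)

# MLP 1:1:1 parameters: hidden weight/bias (w1, b1), output weight/bias (w2, b2).
w1, b1, w2, b2 = 0.5, 0.0, 0.5, 0.0
lr = 0.05  # learning rate (arbitrary choice for this sketch)


def forward(x, w1, b1, w2, b2):
    h = np.tanh(w1 * x + b1)  # single hidden unit
    return h, w2 * h + b2     # linear output unit


for _ in range(2000):
    h, out = forward(x, w1, b1, w2, b2)
    err = out - y                      # output error for E = mean(err**2) / 2
    g_w2 = np.mean(err * h)
    g_b2 = np.mean(err)
    delta = err * w2 * (1.0 - h ** 2)  # error back-propagated to the hidden unit
    g_w1 = np.mean(delta * x)
    g_b1 = np.mean(delta)
    w1, b1 = w1 - lr * g_w1, b1 - lr * g_b1
    w2, b2 = w2 - lr * g_w2, b2 - lr * g_b2

_, pred = forward(x, w1, b1, w2, b2)
corr = np.corrcoef(pred, y)[0, 1]  # Pearson correlation, output vs. target
print(f"correlation between network output and target: {corr:.3f}")
```

With only four adjustable parameters, such a network can represent little more than a scaled and shifted sigmoidal function of its single input, which is why it is found so quickly.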
The search time is only about seven seconds on a 733 MHz Pentium. However,
with 838 cases we are still far from the typical bulk of a data warehouse! Also, the
quick search failed to find any alternative solutions to using GIR, and even failed
to see any contribution of the BRANCH variable, which we may rightly suspect of
having a definite influence on the results. As a matter of fact, by performing a
quick search only for the industrial firms (BRANCH=3, 500 cases), a better