2.5 Feature Assessment
Scatter plots are useful for gaining insight into the topology of the classes
and clusters, especially for identifying features that are less correlated among
themselves and have more discriminative capability.
Figure 2.19 shows the scatter plot for the three classes of cork stoppers using
features ART and RAN, which, as we will see in a later section, are quite
discriminative and less correlated than others.
[Scatter plot image; horizontal axis: ART]
Figure 2.19. Scatter plot for the three classes of cork stoppers (features ART and
RAN).
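The degree of correlation between two features, which a scatter plot shows visually, can also be summarized numerically. The following is a minimal sketch, assuming the sample Pearson correlation coefficient is an acceptable summary; the function name `pearson_r` and the three small feature lists are hypothetical and are not the cork stopper data.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant feature has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Made-up values for illustration only; these are NOT the cork stopper data.
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [2.1, 3.9, 6.2, 8.0, 9.8]  # nearly linear in feature_a
feature_c = [3.0, 1.0, 4.0, 1.0, 5.0]  # only loosely related to feature_a

r_ab = pearson_r(feature_a, feature_b)  # close to 1: strongly correlated pair
r_ac = pearson_r(feature_a, feature_c)  # much smaller in magnitude
```

A pair with a coefficient near zero corresponds to a scatter plot without a visible linear trend. A low correlation alone does not imply good class separation, so the plot should still be inspected.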
2.5.2 Distribution Model Assessment
Some pattern classification approaches assume a distribution model of the patterns.
In these cases one has to assess whether the distributions of the feature vectors
comply reasonably with the model used. Also, the statistical inference tests that are
to be applied for assessing feature discriminative power may depend on whether
the distributions obey a certain model or not. The distribution model that is by far
the most popular is the Gaussian or normal model.
Statistical software products include tests on the acceptability of a given
distribution model. On this subject Statistica offers results of the
Kolmogorov-Smirnov (K-S) and the Shapiro-Wilk tests along with the
representation of histograms.
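To make the K-S idea concrete, the following minimal sketch computes only the K-S statistic, that is, the largest gap between the empirical distribution function of a sample and a normal curve with the same sample mean and variance. It is not the routine of any particular statistical package. The function name `ks_statistic_normal` and the two synthetic samples are hypothetical. No p-value is computed, since with parameters estimated from the sample the standard K-S tables do not apply directly and a correction (such as Lilliefors') is needed.

```python
import math
from statistics import NormalDist


def ks_statistic_normal(sample):
    """K-S distance between the sample's empirical CDF and a normal CDF
    fitted with the sample mean and (unbiased) sample variance."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mu = sum(sample) / n
    var = sum((v - mu) ** 2 for v in sample) / (n - 1)
    if var == 0:
        raise ValueError("sample has zero variance")
    fitted = NormalDist(mu, math.sqrt(var))
    d = 0.0
    for i, v in enumerate(sorted(sample)):
        f = fitted.cdf(v)
        # The empirical CDF jumps from i/n to (i+1)/n at the i-th ordered value;
        # check the gap on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic samples for illustration only (not the cork stopper features).
n = 100
probs = [(i + 0.5) / n for i in range(n)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]  # standard normal quantiles
skewed = [-math.log(1.0 - p) for p in probs]            # exponential quantiles

d_normal = ks_statistic_normal(normal_like)  # small distance
d_skewed = ks_statistic_normal(skewed)       # clearly larger distance
```

A small distance is compatible with the normal model, whereas a large one, as for the skewed sample, argues against it. A statistical package turns this distance into the p-value on which the decision is based.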
Figure 2.20 shows the histograms of features PRT and ARTG for class ω1, with an
overlaid normal curve (with the same sample mean and variance) and the results
of normality tests at the top.
As can be appreciated, feature PRT is well modelled by a normal distribution
(K-S p > 0.2), whereas for feature ARTG, the normality assumption has to be