Page 360 - Introduction to Statistical Pattern Recognition
Experiment 10: Estimation of the Parzen error, L and R
Data: RADAR (Real data, n = 66, $\varepsilon^*$ = unknown)
Sample size: $N_1 = N_2$ = 720, 360
No. of trials: $\tau$ = 1
Kernel: $A_i = \hat{\Sigma}_i$ for $N_{cov}$ = 8800 (Case 1)
        $A_i = \hat{\Sigma}_{ik}$ for $N_{cov}$ = 720, 360 (Case 2)
        $A_i = \hat{\Sigma}_i$ for $N_{cov}$ = 720, 360 (Case 3)
        $N_{cov}$: No. of samples used to estimate $\Sigma$
Kernel size: $r$ = 9.0
Threshold: Option 4
Results: Table 7-3(b) [14]
This experiment demonstrates more clearly the importance of the selection of
the kernel covariance. Note that even as the sample size used to estimate the
covariance matrices becomes small, the L error rates continue to provide
reasonable and consistent bounds in Case 2 of Table 7-3(b). This is in contrast to
the results given in Case 3 in which the estimated covariances are blindly used
without employing the L type covariance. As expected, the bounds become
worse as the sample sizes decrease.
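
The difference between the two estimates can be made concrete with a small sketch. It assumes, as is usual for the Parzen procedure, that R denotes the resubstitution estimate (the test sample stays in the design set) and L the leave-one-out estimate (the test sample is removed before its own class density is computed). The data are synthetic one-dimensional samples, not the RADAR data, and the function names, the kernel size r = 0.3, equal priors, and a zero threshold are all illustrative assumptions.

```python
import math
import random

def normal_kernel(x, xj, r):
    """1-D normal kernel of size r centred on the design sample xj."""
    return math.exp(-0.5 * ((x - xj) / r) ** 2) / (r * math.sqrt(2.0 * math.pi))

def parzen_errors(class1, class2, r):
    """Return (R error, L error) as fractions; equal priors, threshold 0."""
    k0 = normal_kernel(0.0, 0.0, r)  # weight a sample places on itself
    r_wrong = 0
    l_wrong = 0
    for own, other in ((class1, class2), (class2, class1)):
        n_own = len(own)
        for x in own:
            total_own = sum(normal_kernel(x, xj, r) for xj in own)
            p_other = sum(normal_kernel(x, xj, r) for xj in other) / len(other)
            p_own_r = total_own / n_own               # test sample included (R)
            p_own_l = (total_own - k0) / (n_own - 1)  # test sample removed (L)
            r_wrong += int(p_own_r < p_other)
            l_wrong += int(p_own_l < p_other)
    n_total = len(class1) + len(class2)
    return r_wrong / n_total, l_wrong / n_total

random.seed(0)
class1 = [random.gauss(0.0, 1.0) for _ in range(100)]
class2 = [random.gauss(2.0, 1.0) for _ in range(100)]
r_err, l_err = parzen_errors(class1, class2, r=0.3)
print(f"R error = {r_err:.3f}, L error = {l_err:.3f}")
```

Because removing the test sample can only lower the density estimate of its own class, every sample misclassified by R is also misclassified by L, so the L error is never smaller than the R error in this sketch. The gap between them comes entirely from the self-weight k0, which is the quantity the choice of kernel controls.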
Effect of m: Finally, in kernel selection, we need to decide which is
better, a normal or a uniform kernel. More generally, we may address the
selection of m in (6.3). The results using normal kernels (m = 1) are shown in
Fig. 7-12, in which the upper bounds of the Bayes error are observed to be
excellent, but the lower bounds seem much too conservative. This tends to
indicate that the normal kernel function places too much weight on the sample
being tested in the R error estimate. Hence, one possible approach to improving
the lower bound of the Parzen estimate is to use a non-normal kernel function
which places less weight on the test sample and more weight on the neighboring
samples. The uniform kernel function, with constant value inside a specified
region, is one such kernel function. However, if a uniform kernel function is
employed, one must decide which decision to make when the density estimates
from the two classes are equal, making the Parzen procedure even more complex.
A smooth transition from a normal kernel to a uniform kernel may be obtained by
using the kernel function of (6.3) and changing m. The parameter m determines
the rate at which the kernel function drops off.
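
The role of m can be illustrated numerically. Equation (6.3) is not reproduced on this page, so the sketch below assumes the kernel depends on the normalized squared distance d2 through a factor of the form exp{-(c d2)^m}, with the scale constant c built from gamma functions of n and m. That constant is an assumption and should be checked against (6.3); the normalizing factor in front of the exponential is omitted, and the name kernel_profile is hypothetical.

```python
import math

def kernel_profile(d2, m, n):
    """Unnormalized kernel value at normalized squared distance d2.

    Assumed form: exp(-(c * d2) ** m), where c depends on the
    dimension n and the shape parameter m (to be checked against (6.3)).
    """
    c = math.gamma((n + 2) / (2 * m)) / (n * math.gamma(n / (2 * m)))
    return math.exp(-((c * d2) ** m))

n = 8  # illustrative dimension, not the n = 66 of the RADAR data
for m in (1, 2, 4, 50):
    values = [kernel_profile(d2, m, n) for d2 in (0.0, 5.0, 10.0, 15.0)]
    print(m, [round(v, 4) for v in values])
```

Under this assumed form, m = 1 gives c = 1/2 and hence the normal profile exp(-d2/2). For large m the profile stays close to its peak value out to roughly d2 = n + 2 and then falls off sharply, which is the uniform-like behavior described in the text.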
For m = 1, (6.3) reduces to a simple normal kernel. As m becomes large, (6.3)

