Page 291 - Geochemical Anomaly and Mineral Prospectivity Mapping in GIS

             predictor variables used to generate the discriminant function. The values of S DL can then
             be used for mapping geo-objects (e.g., prospective areas) of interest.
                In any method of DA, there are five basic assumptions about the data of the predictor
             variables. Firstly, the total number of cases must be at least five times the number of
             predictor variables (Tabachnick and Fidell, 2007). The number of cases (locations) for
             each group D can be equal or unequal, but if they are unequal the number of cases in the
             smallest (or smaller) group must be greater than the number of predictor variables.
             Secondly, the data of the predictor variables for the cases of each group represent
             samples from a multivariate normal distribution. This assumption is difficult to justify in
             mineral prospectivity mapping, especially because the ‘deposit-type’ cases and, thus, the
             data of the predictor variables for most of these cases are unlikely to be representative of
             samples derived from a multivariate normal distribution (see Fig. 8-6). Fortunately, DA
             is not seriously affected by violations of the normality assumption as long as
             non-normality is not due to outliers (Davis, 2002; Tabachnick and Fidell, 2007). Thirdly, the
             variance-covariance matrices of the groups should be equal, although inequality of
             variances is, like violation of normality, not ‘fatal’ to DA (Davis, 2002; Tabachnick and
             Fidell, 2007). Fourthly, the predictor variables are neither completely redundant nor
             conditionally dependent (i.e., highly correlated) because, if they are, the
             variance-covariance matrix is said to be ill-conditioned and thus cannot be reliably
             inverted. Like the normality assumption, the
             assumption of conditional independence is difficult to justify for data of the predictor
             variables at deposit-type locations. Finally, none of the cases used to derive the
             discriminant function are misclassified (i.e., none of the cases from one group belongs to
             another group).
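
The first and fourth assumptions lend themselves to simple numerical screening before any discriminant function is derived. The following is a minimal sketch in Python, not a procedure taken from the case study: the function names, the dictionary layout of the predictor data and the correlation threshold of 0.95 are all assumptions made for illustration, and each predictor is assumed to have non-zero variance.

```python
from math import sqrt


def check_sample_sizes(group_sizes, n_predictors):
    """Screen the first DA assumption.

    Returns (total_ok, smallest_ok): whether the total number of cases is at
    least five times the number of predictors, and whether the smallest group
    has more cases than there are predictors.
    """
    total_ok = sum(group_sizes) >= 5 * n_predictors
    smallest_ok = min(group_sizes) > n_predictors
    return total_ok, smallest_ok


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # assumes neither variable is constant


def highly_correlated_pairs(columns, threshold=0.95):
    """Flag predictor pairs whose |r| meets an (assumed) threshold.

    `columns` maps a predictor name to its list of values at the training
    locations. The threshold is illustrative, not a published cut-off.
    """
    names = sorted(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson(columns[first], columns[second])) >= threshold:
                flagged.append((first, second))
    return flagged
```

A flagged pair suggests that one of the two predictors may be nearly redundant, so that the variance-covariance matrix could be close to singular. Pairwise correlation is only a rough screen, because redundancy can also arise from combinations of three or more predictors.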
                In the case study (see below), all the aforementioned five basic assumptions of DA,
             except the third basic assumption, are addressed as follows. With respect to the first
             basic assumption of DA, the deposit-type locations, which are very few compared to the
             number of predictor variables (see below), are not used for training but for testing.
             Instead, proxy deposit-type locations are used for training. Two training sets, each
             consisting of equal numbers of proxy deposit-type locations and non-deposit locations,
             are used in LDA. With respect to the second basic assumption of DA, one training set
             consisting of coherent proxy deposit-type locations (Fig. 8-8) is used in order to address
             the problem of non-normality due to outliers. In order to illustrate the utility of coherent
             proxy deposit-type locations in data-driven modeling of mineral prospectivity, another
             training set consisting of randomly-selected proxy deposit-type locations is used. With
             respect to the fourth basic assumption of DA, it is considered that the predictor variables
             at the coherent proxy deposit-type locations are not completely redundant because they
             are not completely coherent (see Fig. 8-7). With respect to the fifth basic assumption of
             DA, non-deposit locations that are highly dissimilar from the coherent proxy deposit-
             type locations (see Fig. 8-7) are used in the two training sets described above.
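
The design of the two training sets can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the case study: the text does not say how dissimilarity is measured, so the sketch assumes a hypothetical score per candidate non-deposit location (lower meaning less similar to the coherent proxy deposit-type locations). All function names and location identifiers are invented for the example.

```python
import random


def select_non_deposits(similarity, n):
    """Pick the n candidate locations with the lowest similarity score.

    `similarity` maps a candidate non-deposit location id to a hypothetical
    similarity score. Ties are broken by location id so the result is
    reproducible.
    """
    ranked = sorted(similarity, key=lambda loc: (similarity[loc], loc))
    return ranked[:n]


def balanced_training_set(proxy_ids, similarity):
    """Pair the proxy locations (label 1) with equally many non-deposits (label 0)."""
    non_deposits = select_non_deposits(similarity, len(proxy_ids))
    return [(loc, 1) for loc in proxy_ids] + [(loc, 0) for loc in non_deposits]


def random_proxies(all_proxy_ids, n, seed=0):
    """Draw n proxy locations at random, for the comparison training set."""
    rng = random.Random(seed)  # fixed seed only so that runs are repeatable
    return sorted(rng.sample(sorted(all_proxy_ids), n))


# Hypothetical inputs: two coherent proxies and four candidate non-deposits.
similarity = {"n1": 0.9, "n2": 0.1, "n3": 0.4, "n4": 0.2}
coherent_set = balanced_training_set(["p1", "p2"], similarity)
random_set = balanced_training_set(
    random_proxies(["p1", "p2", "p3", "p4"], 2), similarity
)
```

Both sets draw their non-deposit locations from the same ranking, so any difference in the resulting models can be attributed to the choice of proxy deposit-type locations.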
                There are two statistical tests of significance in DA (Tabachnick and Fidell, 2007).
             First, an F-test (Wilks’ lambda) is applied to test the null hypothesis that two groups
             under examination have identical multivariate means (i.e., if the discriminant model as a