2.3 Feature selection
Feature selection (FS) is a procedure commonly employed in machine learning to address the high-dimensionality problem. It selects a subset of essential features and removes irrelevant, noisy, and redundant features, yielding a simpler and more concise data representation. In FS, a subset of features is selected from the original feature set based on feature relevance and redundancy. On this basis, Yu et al. [8] in 2004 classified features into four types: (1) noisy and irrelevant, (2) redundant and weakly relevant, (3) weakly relevant and nonredundant, and (4) strongly relevant. A feature that is not required for prediction accuracy is known as an irrelevant feature. The components common to filter and wrapper methods are the model, the search strategy, the feature quality measure, and the feature evaluation. The set of features is a key factor in determining the hypothesis of the predictive model: as the number of features increases, the hypothesis space grows as well. For example, if a data set has M binary features with a binary class label, there are 2^(2^M) combinations in the search space.
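This growth is easy to verify directly. The short sketch below is a toy illustration (the choice M = 4 is hypothetical, picked only to keep the enumeration printable): it lists all 2^M candidate feature subsets and prints the size of the corresponding Boolean hypothesis space.

```python
# Toy illustration of search-space growth in feature selection.
# M = 4 is an arbitrary example value, not from the text.
from itertools import combinations

M = 4
features = [f"f{i}" for i in range(M)]

# Every possible subset of the M features (including the empty set).
subsets = [
    list(combo)
    for r in range(M + 1)
    for combo in combinations(features, r)
]

print(len(subsets))   # 2**4 = 16 candidate feature subsets
print(2 ** 2 ** M)    # 2^(2^4) = 65536 Boolean hypotheses over 4 binary features
```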
FS methods are classified into three types based on their interaction with the learning model: filter, wrapper, and embedded methods. In the filter method, features are selected based on statistical measures. It is independent of the learning algorithm and requires less computational time. Information gain, the chi-square test [9], the Fisher score, the correlation coefficient, and the variance threshold are some of the statistical measures used to assess the importance of features.
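As a concrete illustration, the sketch below applies one of the statistical measures named above, the chi-square test [9], through scikit-learn's SelectKBest. The data set and the choice k = 2 are illustrative assumptions, not choices made in the text; any nonnegative feature matrix would do.

```python
# A minimal filter-method sketch: score features with the chi-square
# statistic, independently of any downstream classifier.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)

# Score every feature against the class label and keep the 2 best
# (k = 2 is an arbitrary example value).
selector = SelectKBest(score_func=chi2, k=2)
X_reduced = selector.fit_transform(X, y)

print(selector.scores_)   # per-feature chi-square statistics
print(X_reduced.shape)    # (150, 2): selection done without training a classifier
```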
The performance of the wrapper method depends on the classifier: the best subset of features is selected based on the classifier's results. Wrapper methods are computationally more expensive than filter methods because of the repeated learning steps and cross-validation, but they are more accurate than filter methods. Examples include recursive feature elimination [10], sequential FS algorithms [11], and genetic algorithms.
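A minimal wrapper-method sketch follows, using recursive feature elimination [10] around a logistic-regression classifier. The classifier and the target of two retained features are illustrative assumptions; the point is that each elimination round refits the model, which is exactly what makes wrapper methods costlier than filters.

```python
# A minimal wrapper-method sketch: recursive feature elimination (RFE).
# The estimator and n_features_to_select are example choices.
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# RFE repeatedly fits the classifier and drops the weakest feature,
# so selection quality is tied to the chosen classifier.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=2)
rfe.fit(X, y)

print(rfe.support_)   # boolean mask of the selected features
print(rfe.ranking_)   # rank 1 marks the features that were kept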
The third approach is the embedded method, which uses ensemble learning and hybrid learning methods for FS. Since it makes a collective decision, its performance is better than that of the other two models; random forest is one such example. It is computationally less intensive than wrapper methods, but it has the drawback of being specific to a particular learning model.
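The sketch below illustrates the embedded approach with a random forest, whose feature importances are a by-product of training. The forest size and the "median" threshold are illustrative assumptions; note that the selection is tied to the forest's impurity-based importances, which is the model-specificity drawback noted above.

```python
# A minimal embedded-method sketch: selection falls out of model training.
# n_estimators and the "median" threshold are example choices.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_iris(return_X_y=True)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
selector = SelectFromModel(forest, threshold="median")
X_reduced = selector.fit_transform(X, y)

print(selector.estimator_.feature_importances_)  # impurity-based importances
print(X_reduced.shape)                           # features at or above the median kept
```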