Page 360 - From Smart Grid to Internet of Energy


            The training data are evaluated by three algorithms that act as a voting
            strategy, as seen in Fig. 8.6. The predictions are compared with the true labels
            to identify noisy instances, which are filtered out to generate clean data. This
            filtering and cleaning procedure is repeated until the last fold is reached.
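
            The voting-based filtering described above can be sketched as follows. This is
            a minimal illustration, not the procedure of Fig. 8.6 itself: the three voters
            (1-nearest-neighbour, 3-nearest-neighbour, and nearest-centroid), the majority
            rule, the fold assignment, and the toy samples are all assumptions chosen to
            keep the example self-contained.

```python
from collections import Counter


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, point, k):
    """Predict the majority label among the k nearest training samples."""
    nearest = sorted(train, key=lambda s: dist2(s[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, point):
    """Predict the label whose class centroid is closest to the point."""
    sums, counts = {}, Counter()
    for features, label in train:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
    return min(centroids, key=lambda lab: dist2(centroids[lab], point))


def voting_filter(samples, n_folds=3):
    """Split samples into (clean, noisy) using a three-classifier majority vote.

    Each fold is held out in turn; the voters are trained on the remaining
    folds, and a held-out sample is flagged as noisy when at least two of
    the three predictions disagree with its recorded label (assumed rule).
    """
    clean, noisy = [], []
    for fold in range(n_folds):
        held_out = [s for i, s in enumerate(samples) if i % n_folds == fold]
        train = [s for i, s in enumerate(samples) if i % n_folds != fold]
        for features, label in held_out:
            votes = [
                knn_predict(train, features, 1),
                knn_predict(train, features, 3),
                centroid_predict(train, features),
            ]
            wrong = sum(vote != label for vote in votes)
            (noisy if wrong >= 2 else clean).append((features, label))
    return clean, noisy


# Hypothetical toy data: two well-separated groups and one mislabeled sample.
samples = [
    ((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"), ((0.3,), "a"),
    ((0.15,), "b"),  # sits inside group "a" but carries label "b"
    ((5.0,), "b"), ((5.1,), "b"), ((5.2,), "b"), ((5.3,), "b"),
]
clean, noisy = voting_filter(samples)
print("flagged as noise:", noisy)
```

            With this toy data only the mislabeled sample is flagged, and the remaining
            eight samples are kept as clean data. A stricter consensus rule (all three
            voters must disagree) would remove fewer samples at the cost of keeping more
            noise.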
               Recent progress has brought several different analysis approaches such
            as data mining, visualization, statistical methods, deep learning, and machine
            learning. Machine learning methods facilitate knowledge discovery and
            intelligent decision-making from massive databases. Machine learning is divided
            into three categories according to the learning basis: supervised, unsupervised,
            and reinforcement learning. Conventional data mining methods such as
            association mining, clustering, and classification lack efficiency, and they are
            not able to provide scalable and accurate outcomes when they are applied to
            Big Data stacks. The size, speed, and variety of data streams prevent
            conventional data mining methods from analyzing data stacks continuously.
            Therefore, researchers have developed new optimization methods and analytical
            approaches to improve processing capability with limited resources.


            8.3.2 Machine learning in big data analytics
            Machine learning is a research area of computer science and an application
            area of artificial intelligence that is based on processing inductive models
            trained with limited input data. It has developed from work on pattern
            recognition and computational learning systems. The input data, called the
            training set or samples, provide patterns from which the learning algorithm
            defines relationships among the parameters of the database. A machine learning
            system falls into one of three learning categories: supervised, unsupervised,
            and reinforcement. Supervised learning predicts an output vector by using
            knowledge inherited from a training set of input vectors and their
            corresponding relations. The supervised learning methodology is based on
            classification and regression methods, where classification predicts categorical
            variables while regression predicts numerical variables. On the other hand,
            unsupervised learning does not rely on a labeled training set, and no labeling
            is required for predicting the variables. Typical examples of these learning
            structures are clustering algorithms and recommender systems. Reinforcement
            learning addresses the problem of learning a particular action, or a set of
            actions, that improves the reliability of outcomes in a predefined situation.
            The most widely used machine learning algorithms and data processing
            methods are presented in Table 8.1 [20–22].
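
            The difference between supervised and unsupervised learning can be illustrated
            with a short sketch. The example below is an assumption-laden illustration
            rather than a method taken from Table 8.1: `fit_line` is a hypothetical helper
            that performs least-squares regression (supervised, since each input has a
            known numerical target), and `kmeans_1d` is a hypothetical helper that clusters
            unlabeled values with a basic k-means loop (unsupervised). All data values are
            invented for the example.

```python
def fit_line(xs, ys):
    """Supervised regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kmeans_1d(values, centers, iterations=10):
    """Unsupervised clustering: refine cluster centers without any labels."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to the index of its nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Supervised: inputs come paired with known numerical targets (made-up data).
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(inputs, targets)
print("regression:", slope, intercept)

# Unsupervised: only raw values are given, no labels (made-up data).
readings = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
centers = kmeans_1d(readings, centers=[0.0, 10.0])
print("cluster centers:", centers)
```

            The regression recovers the line that generated the targets (slope 2,
            intercept 1), while the clustering settles on one center near 1 and one near 9
            without ever seeing a label.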
               The machine learning process is performed in a few steps: data
            acquisition, preprocessing, feature selection, feature extraction, model
            selection, and validation. Different datasets and inputs are combined at the
            data acquisition and preprocessing steps, and data cleaning is also performed
            at this stage. The predefined features are selected and extracted in the next
            step, which is followed by the model selection step. All the selected and processed data