Page 361 - From Smart Grid to Internet of Energy
Big data, privacy and security in smart grids Chapter 8 325
TABLE 8.1 Most commonly used machine learning algorithms in big data analytics

Algorithm type                          Data processing method
--------------------------------------  -------------------------------------
Naive Bayes                             Classification
K-nearest neighbor                      Classification
Support vector machine (SVM)            Classification
Linear regression                       Classification/regression
Support vector regression               Classification/regression
Classification and regression trees     Classification/regression
Random forest                           Classification/regression
Bagging                                 Classification/regression
Artificial neural network               Clustering/classification/regression
Feedforward neural network              Clustering/classification/regression
K-means                                 Clustering
Density-based spatial clustering        Clustering

are validated at the last step, where classification, regression, and evaluation
of the processed data are performed. Classification and prediction are the most
important initial processing steps, since they provide filtering, cleaning,
validation, and model selection on the input databases. Model selection for a
machine learning algorithm makes it possible to use the learning datasets. The
models cover various tasks such as classification, regression, detection,
sampling, noise filtering, and others. Support vector machines (SVMs) and
artificial neural networks (ANNs) are the most widely used models in machine
learning systems. The
conventional SVMs are binary classifiers that separate the training data into
two classes with the maximum margin between them. To do so, an SVM determines
a separating hyperplane as a linear function of the input data. Another
important feature concerns the number of training points required: an SVM needs
only a few points, called support vectors, to classify subsequent data points.
SVMs are regarded as among the most effective supervised learning models
because they handle high-volume datasets efficiently with limited memory
resources. However, a drawback of SVMs is that they cannot provide direct
probability estimates [20, 22].
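
The linear hyperplane, the role of the few points nearest to it, and the lack of a direct probability output can be illustrated with a small sketch. The code below is not taken from the cited references; it assumes a plain subgradient-descent training loop on the regularized hinge loss, and the toy data, the function names (train_linear_svm, decision_value, predict, closest_points), and the parameter values are hypothetical choices made only for illustration.

```python
# Minimal linear SVM sketch in pure Python (no third-party libraries).
# Assumption: training uses subgradient descent on the regularized hinge
# loss; the toy data and all names below are hypothetical.

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=200):
    """Fit a hyperplane (w, b) for labels in {-1, +1}."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * decision_value(w, b, x)
            # L2 regularization shrinks w slightly on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # The point violates the margin, so move the hyperplane.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Binary decision: +1 on one side of the hyperplane, -1 on the other."""
    return 1 if decision_value(w, b, x) >= 0.0 else -1


def closest_points(points, labels, w, b, count):
    """Return the `count` training points with the smallest margins.

    These approximate the support vectors, i.e. the few points that
    actually pin down the position of the hyperplane.
    """
    ranked = sorted(zip(points, labels),
                    key=lambda pl: pl[1] * decision_value(w, b, pl[0]))
    return [p for p, _ in ranked[:count]]


# Hypothetical, linearly separable toy data.
POINTS = [(-2, -1), (-1, -2), (-3, -3), (-2, -2),
          (2, 1), (1, 2), (3, 3), (2, 2)]
LABELS = [-1, -1, -1, -1, 1, 1, 1, 1]

w, b = train_linear_svm(POINTS, LABELS)
print("hyperplane:", w, b)
print("class of (4, 3):", predict(w, b, (4, 3)))
print("raw score of (4, 3):", decision_value(w, b, (4, 3)))
print("near-margin points:", closest_points(POINTS, LABELS, w, b, 4))
```

The raw score printed for (4, 3) is a signed, unbounded value and not a probability, which is the drawback noted above. Turning such scores into probabilities requires an additional calibration step that is outside the scope of this sketch.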
The ANN builds on the processing of larger datasets, improved initialization
algorithms, robust learning models, and a multilayered structure, which is called
deep learning. The complex structure of ANNs, which is formed by hidden layers
and intermediate layers, is simplified by feedforward architectures that are