Page 365 - From Smart Grid to Internet of Energy
Big data, privacy and security in smart grids Chapter 8 329
changing and increasing data stacks. The solution to processing massive data
stacks is based on the multisource mining mechanisms and data mining algorithms
presented in the previous section. The machine learning algorithms most widely
used in smart energy applications are k-means, linear support vector machines
(LSVM), logistic regression (LR), locally weighted linear regression (LWLR),
Gaussian discriminant analysis (GDA), back-propagation neural network (BPNN),
expectation maximization (EM), naive Bayes (NB), and independent variable
analysis (IVA) [24]. Since big data analysis is still evolving, applications
face several challenges and threats in terms of data storage, data integration,
instant data processing, data compression, and security. These threats and
challenges are presented in the following sections with regard to privacy,
security, and related issues.
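As an illustration of the first algorithm listed above, the following is a
minimal sketch of k-means applied to load profiles. The profile values, the
four-interval layout, and the function name `kmeans` are assumptions made for
this example only; they do not come from the cited studies.

```python
def kmeans(points, k, iterations=20):
    """Cluster equal-length numeric vectors with plain Lloyd's k-means."""
    # Simplification: start from the first k points. Practical code would
    # usually prefer k-means++ seeding or several random restarts.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each profile to its nearest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical load profiles in kW: night, morning, afternoon, evening.
profiles = [
    [0.2, 0.3, 0.4, 1.8],  # evening-peaking consumers
    [0.3, 0.2, 0.5, 2.0],
    [0.2, 0.4, 0.3, 1.9],
    [0.4, 2.1, 2.3, 0.5],  # daytime-peaking consumers
    [0.3, 1.9, 2.2, 0.4],
    [0.5, 2.0, 2.4, 0.6],
]
labels, centroids = kmeans(profiles, k=2)
```

With this toy data the evening-peaking and daytime-peaking profiles end up in
separate clusters, which is the kind of consumer grouping such an algorithm
might support.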
8.4.1 Threats and challenges in privacy
Privacy requirements may vary depending on countries, legal approaches,
personal rights, and regulations. Privacy issues refer to the protection of
sensitive data from intrusions and unauthorized access. Privacy can be divided
into four categories: physical, informational, decisional, and dispositional
[5]. The deployment of big data analytics has turned privacy into a core
problem in data mining applications. Although identification, encryption, and
several other methods are used to enforce data privacy, ICT-based security
risks may not be completely eliminated [27].
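One possible way to protect identifiers before data mining is keyed
pseudonymization. The sketch below uses HMAC-SHA256 from the Python standard
library; this technique is chosen here for illustration and is not named in the
text above. The key, the meter identifiers, and the field names are
hypothetical.

```python
import hashlib
import hmac


def pseudonymize(meter_id: str, secret_key: bytes) -> str:
    """Return a keyed pseudonym for a meter identifier (HMAC-SHA256)."""
    return hmac.new(secret_key, meter_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and readings, for illustration only.
key = b"utility-held-secret"
readings = [
    {"meter_id": "METER-0001", "kwh": 1.2},
    {"meter_id": "METER-0002", "kwh": 0.7},
    {"meter_id": "METER-0001", "kwh": 1.4},
]

# Replace the real identifier with its pseudonym before releasing the data.
released = [
    {"meter": pseudonymize(r["meter_id"], key), "kwh": r["kwh"]} for r in readings
]
```

Readings from the same meter keep the same pseudonym, so they can still be
analyzed together, while the original identifier is not exposed. Consistent
with the caveat above, this does not by itself remove all risk, since
consumption patterns may still allow re-identification.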
Big data analytics can support distribution and transmission system operators
in deploying and calibrating smart grid applications with tools such as
simulation and modeling studies. However, these applications may introduce
several challenges and expose systems to several threats. It should be taken
into consideration that networks are not tolerant to threats and should be
operated under strict reliability and security standards. Moreover, consumer
privacy plays a vital role in network operation [25]. One of the prominent
challenges of big data analytics in the smart grid is related to multisource
data integration and storage. Unlike conventional data analyses, which deal
with a single plane, big data is a fusion of data stacks from multiple sources
with different formats and presentations. Therefore, data storage and the
multiple-source structure should be supported by HDFS and similar systems.
Rapid reactions such as fault detection and transient protection play a crucial
role in utility networks. Cloud-based data storage and processing systems can
introduce latency due to complicated and heavy analysis algorithms. Thus,
real-time data processing is an important requirement for preventing latency
from threatening these rapid reactions. These threats are mostly tackled by
storing databases in fast, local memory. Data compression is another crucial
solution for massive databases such as those of wide area monitoring systems
(WAMS) [24].
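To make the data compression point concrete, the following is a minimal sketch
of lossless compression for a slowly varying measurement stream: the samples
are delta-encoded and then deflated with zlib. The sample values (grid
frequency in mHz), the 32-bit integer layout, and the function names are
assumptions for this example; the text above does not specify a compression
scheme for WAMS data.

```python
import struct
import zlib


def compress_samples(samples):
    """Delta-encode integer samples, then deflate them with zlib."""
    if not samples:
        return zlib.compress(b"")
    # Store the first value as-is, then only the change between neighbors.
    deltas = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    raw = struct.pack(f"<{len(deltas)}i", *deltas)  # little-endian int32
    return zlib.compress(raw, 9)


def decompress_samples(blob):
    """Invert compress_samples: inflate, then undo the delta encoding."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 4}i", raw)
    samples, total = [], 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples


# Hypothetical frequency samples in mHz, oscillating around 50 Hz.
samples = [50000 + (i % 7) - 3 for i in range(1000)]
blob = compress_samples(samples)
restored = decompress_samples(blob)
```

Because neighboring samples differ only slightly, the deltas are small and
repetitive, so the compressed blob is much smaller than the 4000 bytes of raw
32-bit samples while `restored` matches the input exactly.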

