Page 349 - From Smart Grid to Internet of Energy

Big data, privacy and security in smart grids Chapter  8 313


                Despite the challenges and technical deficiencies, decision-making
             methods based on inherited data are widely accepted by authorities. Data
             are a strategic asset generated by technical and natural resources. The
             collected data are usually not ready for processing; they are raw data that
             must be located, identified, understood, and prepared for effective
             processing. As a first step, data integration and cleaning are required to convert
             the inherited raw data for storage. Big data differs from conventional data
             management systems because of its heterogeneous formats, which include structured,
             unstructured, and semi-structured data sets, as seen in Fig. 8.2. Reports
             note that nearly 85% of inherited data are semi-structured or unstructured and
             are handled by nonrelational analytic technologies such as MapReduce or
             Hadoop. Among the characteristics of big data, the three Vs (volume, velocity,
             and variety) are especially important for data analytics. The data volume grows
             enormously year by year and is expected to reach 40 zettabytes (ZB) by
             2020. Therefore, the velocity of data acquisition and processing should keep
             pace with the growing data volume. The variety of data is
             another concern of big data research, since data types and databases
             differ in being structured or unstructured, public or private, shared or
             confidential [6, 7].
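
The integration and cleaning step described above can be illustrated with a minimal Python sketch. It assumes two hypothetical sources of raw smart-meter data, a structured CSV export and semi-structured JSON messages; the field names (`meter_id`, `kwh`, `reading`, and so on) and the sample values are invented for illustration and do not come from the text. The sketch maps both sources onto one common record layout, then drops readings that are missing, malformed, or duplicated.

```python
import csv
import io
import json

# Hypothetical raw inputs (illustrative values only):
# a structured CSV export and semi-structured JSON messages.
CSV_RAW = (
    "meter_id,timestamp,kwh\n"
    "m1,2020-01-01T00:00,1.2\n"
    "m2,2020-01-01T00:00,\n"        # missing reading
    "m1,2020-01-01T00:00,1.2\n"     # duplicate of the first row
)
JSON_RAW = [
    '{"meter": "m3", "ts": "2020-01-01T00:00", "reading": {"kwh": "0.8"}}',
    '{"meter": "m4", "ts": "2020-01-01T00:00"}',  # no reading at all
]


def from_csv(text):
    """Yield (meter, timestamp, kwh) tuples from the structured source."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["meter_id"], row["timestamp"], row["kwh"]


def from_json(lines):
    """Yield (meter, timestamp, kwh) tuples from the semi-structured source."""
    for line in lines:
        msg = json.loads(line)
        yield msg.get("meter"), msg.get("ts"), msg.get("reading", {}).get("kwh")


def clean(records):
    """Keep only well-formed, non-duplicate readings in a common layout."""
    seen = set()
    cleaned = []
    for meter, ts, kwh in records:
        try:
            value = float(kwh)
        except (TypeError, ValueError):
            continue  # missing or malformed reading
        key = (meter, ts)
        if meter is None or ts is None or key in seen:
            continue  # incomplete or duplicate record
        seen.add(key)
        cleaned.append({"meter_id": meter, "timestamp": ts, "kwh": value})
    return cleaned


def integrate():
    """Integrate both sources, then clean the combined record stream."""
    return clean(list(from_csv(CSV_RAW)) + list(from_json(JSON_RAW)))


if __name__ == "__main__":
    for record in integrate():
        print(record)
```

In a real deployment the same two stages would typically run on a distributed platform rather than in a single process; the sketch only shows the logic of mapping heterogeneous formats to one schema before storage.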
                In addition to the challenges in data acquisition, big data applications raise
             several problems in generating correct metadata, which relates to processing
             the acquired and stored data. The data analysis challenges are tackled with
             sophisticated data mining techniques that help to discover integrated,
             meaningful, clear, and accessible data stacks. Gradually increasing data sizes
             and volumes force researchers to improve computational methods for efficient
             data processing. Big data analytics requires efforts such as
             integration of massive data types with data correlation procedures, reliable and
             rapid processing models, real-time processing and sampling capabilities of
             processors, and interactive user interfaces for managing the data processing
             ecosystem. The data processing operations rely on linear
             equation solvers, optimization algorithms, linear and nonlinear prediction
             procedures such as Wiener and Kalman filters, canonical correlation analysis,
             linear discriminant analysis, and adaptive sampling processes such as belief
             propagation, sensing, and k-nearest neighbor algorithms [6]. The stages of
             big data processing are presented in the following sections, organized by
             data generation, data acquisition and storage, machine learning methods, and
             Internet of Things (IoT) applications in the big data ecosystem.
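
As an illustration of one of the prediction procedures named above, the following Python sketch implements a scalar Kalman filter. It is a minimal example under stated assumptions, not the method used in the cited work: the state is modeled as a random walk, and the function name `kalman_1d`, the noise variances, and the initial values are hypothetical choices made for this example.

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the random-walk model x_k = x_{k-1} + w_k.

    measurements: iterable of noisy observations z_k = x_k + v_k
    process_var:  assumed variance of the process noise w_k
    meas_var:     assumed variance of the measurement noise v_k
    x0, p0:       assumed initial state estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state estimate is carried over, its uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Hypothetical noisy readings around a true value of 5.0.
    readings = [5.3, 4.8, 5.1, 4.9, 5.2, 5.0, 4.7, 5.1]
    for z, est in zip(readings, kalman_1d(readings)):
        print(f"measurement={z:.2f}  estimate={est:.3f}")
```

The gain shrinks as the estimate variance falls, so early measurements move the estimate strongly and later ones only refine it. Larger values of `process_var` make the filter track changes faster at the cost of passing more noise through.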

             8.2.1  Big data generation

             Data generation is the preliminary step of big data operations. Critical
             applications, measurement and control devices, ICT interfaces used in the smart grid, and
             smart sensors generate the largest share of big data in smart grid applications.
             The big data of the smart grid is a combination of all the inherited data from smart