Page 117 - Building Big Data Applications


             unstructured data processing. The new approach to visualization can be defined as
             follows:
               Data discovery visualization: touching and seeing the raw data first, before
                deciding what is needed. In this stage, we are not looking at conceptual or
                logical models of the data; we are identifying the data and its characteristics,
                completeness, formats, and variances, along with all the different sources that
                deliver similar data. The exercise lets users discover data, attach notes as
                comments, understand the business value, document the operational analytics to
                be delivered from the data, and give data stewards guidance on the usage of the
                data in the data lake, data hubs, analytics, reporting, and visualization. It
                needs to be delivered by a team of business experts on the data, data analysts,
                and the data steward team, which provides the governance required for discovery
                and exploration. The deliverables from this exercise are a set of data formats,
                dictionaries, and metadata; no business rules are applied at the operational
                level. The conceptual architecture is shown in Fig. 6.1, below.
               Operational data analytics and visualization: the different perspectives on how
                data can be visualized and analyzed at an operational level. The visualization
                here does not apply complex business rules; it is about looking inside the data
                and studying the behaviors associated with it. If you want to perform complex
                event processing (CEP), this is the data layer at which you can begin the
                discovery, align the behaviors as you start executing the visualization, and tag
                the data as you see the specifics from which you want to infer insights at the
                data lake, data hub, and analytics layers. This is the stage where we will ask
                for a detailed document to be developed for the data, its formats, its
                integration points, and the




             [Figure 6.1: architecture diagram. Labels shown: Streaming Data; Devices & Sensors;
             Social Media; SCM, ERP, Fin, Files, Photos, Videos; Operational Data Store;
             Data Swamp & Data Lake; Master Data; Apache Spark; Machine Learning;
             SAS / R-Studio; RDBMS; Alerts; Near-Real-Time Monitoring; Self-Service;
             Reports & Models; Data Science; Advanced Analytics; Operational, Historical,
             and Analytical Reporting; Reporting Tool of Choice.]

             FIGURE 6.1 Conceptual data discovery and visualization architecture.
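
The data discovery stage described above asks the team to look at raw data before any modeling and to record its completeness, formats, and variances. A minimal sketch of what such a first-pass profile could look like is given here, using only the Python standard library. The function names (`infer_format`, `profile`), the coarse format labels, and the sample sensor records are all hypothetical and chosen for illustration; the text does not prescribe a specific tool or method for this step.

```python
from collections import Counter
from datetime import datetime


def infer_format(value):
    """Classify a raw string value into a coarse, illustrative format label."""
    text = value.strip()
    if text == "":
        return "missing"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "decimal"
    except ValueError:
        pass
    try:
        # Assumption: only ISO-style dates count as "date" in this sketch.
        datetime.strptime(text, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def profile(records):
    """Return per-field completeness, format counts, and distinct-value counts."""
    fields = sorted({name for record in records for name in record})
    report = {}
    for name in fields:
        # Treat absent keys and None as empty strings so they count as missing.
        raw = []
        for record in records:
            value = record.get(name)
            raw.append("" if value is None else str(value))
        present = [v.strip() for v in raw if v.strip()]
        report[name] = {
            "completeness": len(present) / len(records),
            "formats": dict(Counter(infer_format(v) for v in raw)),
            "distinct": len(set(present)),
        }
    return report


# Hypothetical raw records, as they might arrive from two similar sources.
sample = [
    {"device_id": "A17", "reading": "21.5", "captured": "2020-01-03"},
    {"device_id": "A18", "reading": "", "captured": "03/01/2020"},
    {"device_id": "A17", "reading": "22", "captured": None},
]

for field, stats in profile(sample).items():
    print(field, stats)
```

A field that reports more than one format, such as `captured` in the sample, is the kind of variance between sources that the discovery exercise is meant to surface and document in the data dictionary. No business rules are applied here: the profile only describes the data as it arrives.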