Page 118 - Building Big Data Applications
Chapter 6 Visualization, storyboarding and applications 115
tags to retain as we process it from here to data lake. All the alignment of the data,
the associated sources and the different fields and their attributions will be
completed as a part of this exercise. Before the advent of platforms like Hadoop or
Cassandra, we struggled to deliver this at enterprise scale. The argument is not that
Oracle or Teradata could not have done it; they were designed for a different purpose,
and a platform can serve its associated applications only when the data and its usage
are aligned.
Data lake visualization is the foundational step for enterprise usage of data for
storyboarding. The foundations of the storyboard are defined in the data discovery
and operational data analytics levels. The data is now loaded into an enterprise asset
category, is associated with the appropriate metadata, master data, and social media
tags, and has integration points that different teams across the enterprise will use
to interpret, integrate, and visualize the data. Visualization at this layer will
resolve many of the foundational problems of business intelligence. The familiar
"yet another business intelligence project" moniker is gone. However, please
understand that the technology integrations still need to be created and the data
engineering still needs to be done. The confusion in the marketplace stems from the
term "data lake": it is not a label that a vendor can simply attach to the solution it
delivers. It has to be what is built from the data insights and delivery
layers. The teams involved in this data lake visualization exercise include data
analytics experts, data reporting specialists, analytical modelers, data modelers,
data architects, analytic visualization architects, machine learning experts, and
other documentation specialists and analysts. This team will produce documents, data
models (conceptual, logical, and physical), metadata, master data, and semantic
data elements as identified.
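As an illustration only, the idea of loading data as an enterprise asset with its metadata, master data, and tags can be sketched in a few lines of Python. The names LakeAsset, LakeCatalog, and integration_points are hypothetical and are not taken from any product or from this book; the sketch assumes that a shared master data key is what makes two assets integrable.

```python
from dataclasses import dataclass, field


@dataclass
class LakeAsset:
    """One dataset registered as an enterprise asset (hypothetical structure)."""
    name: str
    source: str
    metadata: dict = field(default_factory=dict)          # e.g. owner, refresh cadence
    master_data_keys: list = field(default_factory=list)  # keys shared across teams
    tags: set = field(default_factory=set)                # e.g. social media tags


class LakeCatalog:
    """In-memory registry of assets; a stand-in for a real metadata catalog."""

    def __init__(self):
        self._assets = {}

    def register(self, asset):
        # Assumption: an asset without metadata is not yet ready for the lake.
        if not asset.metadata:
            raise ValueError(f"{asset.name}: metadata is required before registration")
        self._assets[asset.name] = asset

    def find_by_tag(self, tag):
        """Names of all assets carrying the given tag, in sorted order."""
        return sorted(a.name for a in self._assets.values() if tag in a.tags)

    def integration_points(self, left, right):
        """Master data keys that two assets share, in sorted order."""
        left_keys = set(self._assets[left].master_data_keys)
        right_keys = set(self._assets[right].master_data_keys)
        return sorted(left_keys & right_keys)
```

A team would register each source once and then look up assets by tag or by shared key, instead of rediscovering the alignment for every new reporting project.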
Fig. 6.2 shows the data from a volumetric perspective that we need to define for
data lake creation. If this step has been skipped, do not proceed until it is
completed. Fig. 6.3 shows the analytics and their orientation from a visualization
perspective. Defining this picture is essential for successful implementation of the data
lake and data hubs. The analytical layers will integrate data from the operational raw
data swamp all the way to the data hubs, and the quality of the data is important as we
drill up the hierarchy.
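The point about data quality while drilling up can be made concrete with a minimal sketch. It assumes a two-level hierarchy (store within region), a made-up set of records, and a hypothetical completeness threshold; none of these come from the figures referenced above.

```python
from collections import defaultdict

# Hypothetical raw records: (region, store, sales amount or None if missing)
RAW = [
    ("East", "E1", 120.0),
    ("East", "E2", None),
    ("West", "W1", 200.0),
    ("West", "W1", 50.0),
]


def roll_up(records, min_completeness=0.9):
    """Aggregate store-level records to region level and flag low-quality totals."""
    totals = defaultdict(float)
    seen = defaultdict(int)
    valid = defaultdict(int)
    for region, _store, amount in records:
        seen[region] += 1
        if amount is not None:
            valid[region] += 1
            totals[region] += amount

    summary = {}
    for region in seen:
        completeness = valid[region] / seen[region]
        summary[region] = {
            "total": totals[region],
            "completeness": completeness,
            # Assumption: a summary is trusted only above the threshold.
            "trusted": completeness >= min_completeness,
        }
    return summary
```

Carrying a completeness measure alongside each total lets a dashboard at the hub level show whether a summarized number rests on complete data or on a partial load.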
Data hub visualization is a very finely layered data representation. This is the
actual dashboard and analytics that will be used by executives and senior leaders
of any enterprise. The data layer here is aggregated and summarized for every
specific situation: the associated hierarchical layers are all defined, the
drill-down patterns are identified, and the appropriate granular layers are
aligned. Today we can do this because we have visualized from the bottommost
layer of data discovery all the way to the analytical layers. This kind of data
aggregation and integration is very unusual and has disrupted the way we do data