Page 164 - Building Big Data Applications
Chapter 9 Governance 163
processes, the data governance and stewardship teams collectively determine the
policies, validation and data quality rules, and service-level agreements for creating
and managing master data in the enterprise. These include the following:
Standardized definition of data common to all the systems and applications
Standardized definition of metadata
Standardized definition of processes and rules for managing data
Standardized processes to escalate, prioritize, and resolve data processing issues
Standardized process for acquiring, consolidating, quality processing, aggregating, persisting, and distributing data across an enterprise
Standardized interface management process for data exchange across the enterprise, internally and externally
Standardized data security processes
Ensuring consistency and control in the ongoing maintenance and application use of this information
Metadata about master data is a key attribute that is implemented in every style of
master data implementation. It helps resolve the business rules and processing
conflicts that teams within organizations encounter, and it helps the data governance
process manage and resolve those conflicts in an agile manner.
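As an illustration of how metadata can help resolve such conflicts, the following is a minimal Python sketch, not a method described in this chapter. It assumes each candidate value for a master data attribute carries two pieces of metadata (the source system and the date of last update) and that governance has ranked the sources by trust. The names `AttributeValue`, `SOURCE_PRIORITY`, and `resolve`, as well as the source systems and sample values, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AttributeValue:
    value: str
    source: str    # hypothetical source-system name (metadata)
    updated: date  # date the source last changed the value (metadata)


# Hypothetical governance rule: a lower number means a more trusted source.
SOURCE_PRIORITY = {"crm": 0, "billing": 1, "web": 2}


def resolve(candidates):
    """Pick the surviving value: most trusted source first, then most recent."""
    lowest_trust = len(SOURCE_PRIORITY)  # rank given to unknown sources
    return min(
        candidates,
        key=lambda c: (
            SOURCE_PRIORITY.get(c.source, lowest_trust),
            -c.updated.toordinal(),
        ),
    )


# Three systems disagree on the same customer's address.
candidates = [
    AttributeValue("12 Main St", "web", date(2020, 5, 1)),
    AttributeValue("12 Main Street", "crm", date(2019, 3, 10)),
    AttributeValue("14 Main Street", "billing", date(2021, 1, 15)),
]

winner = resolve(candidates)
print(winner.value, "from", winner.source)
```

In this sketch the rule lives in data (`SOURCE_PRIORITY`) rather than in code, so a stewardship team could change the ranking without touching the resolution logic.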
Data management in big data infrastructure
In the world of big data, there is a lot of ambiguity and uncertainty in the data, which
makes it complex to process, transform, and navigate. To make this processing simple
and agile, a data-driven architecture needs to be designed and implemented. This
architecture will be the blueprint for how the business will explore the data on the big
data side and what it can integrate with data within the RDBMS, which will evolve to
become the analytical data warehouse. Data-driven architecture is not a new concept; it
has been used in business decision-making for ages. The difference is that all the data
touchpoints we are talking about in the current state sit in multiple silos of
infrastructure and are not connected in any visualization, analytic, or reporting
activity today.
Fig. 9.2 shows the data touchpoints in an enterprise prior to the big data wave. For
each cycle of product and service, from ideation to fulfillment and feedback, data was
created in the respective system and processed continuously. The flow of data resembles
a factory model of information processing. Some data elements are common across all
the different business processes and have evolved into the master data for the
enterprise. The rest of the data needs to be analyzed for usage, and there the presence
of metadata is very helpful and accelerates data investigation and analysis. The
downside of the process seen in Fig. 9.1 is the isolation of each layer of the system,
which results in duplication of data and incorrect attribution of the data across the
different systems.
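The duplication problem can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the three systems (`sales`, `support`, `billing`), their records, and the choice of a normalized e-mail address as the matching key are all hypothetical and are not taken from the figures.

```python
from collections import defaultdict

# Hypothetical extracts from three isolated systems; each keeps its own IDs.
systems = {
    "sales": [{"id": "S-1", "email": "Ann@Example.com", "name": "Ann Lee"}],
    "support": [
        {"id": "T-9", "email": "ann@example.com ", "name": "A. Lee"},
        {"id": "T-10", "email": "bob@example.com", "name": "Bob Roy"},
    ],
    "billing": [{"id": "B-4", "email": "ANN@example.com", "name": "Ann Lee"}],
}


def match_key(record):
    """Assumed matching rule: e-mail address, trimmed and lower-cased."""
    return record["email"].strip().lower()


def group_by_key(systems):
    """Map each matching key to the (system, local id) pairs that carry it."""
    groups = defaultdict(list)
    for system, records in systems.items():
        for record in records:
            groups[match_key(record)].append((system, record["id"]))
    return dict(groups)


def duplicates(groups):
    """Keep only the keys that appear in more than one record."""
    return {key: refs for key, refs in groups.items() if len(refs) > 1}


groups = group_by_key(systems)
for key, refs in duplicates(groups).items():
    print(key, "->", refs)
```

Here the same customer exists under three unrelated local IDs. A shared master data key, with metadata recording where each copy lives, is what lets the copies be linked instead of being counted as three customers.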