Page 166 - Building Big Data Applications

Chapter 9   Governance 165


                   Fig. 9.4 shows the detailed processing of data across the different stages, from source
                 systems to the data warehouse and downstream systems. When implemented with
                 metadata and master data integration, the stages become self-contained, and we can
                 manage the complexities of each stage within that stage’s scope of processing, as
                 discussed next:
                   Acquire stage: In this stage of data processing, we simply collect data from
                   multiple sources. The acquisition process can be implemented in several ways: as a
                   direct extract from a database, as data sent in flat files, or as data made available
                   through a web service for extraction and processing.
                     Metadata at this stage will include the control file (if provided), the extract file
                      name, size, and source system identification. All of this data can be collected as
                      a part of the audit process.
                     Master data has no role at this stage, as it relates more to the content of the
                      data extracts, which is handled in the process stage.
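
As an illustration of collecting acquire-stage metadata as part of the audit process, the following is a minimal sketch and not a prescribed implementation. The record layout, the function names, and the control-file format (a single `row_count=<n>` line) are assumptions made for this example; real control files vary by source system.

```python
import os
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ExtractAuditRecord:
    """Audit metadata captured for one extract file (hypothetical layout)."""
    source_system: str
    extract_file_name: str
    size_bytes: int
    control_file_name: Optional[str]
    expected_rows: Optional[int]  # taken from the control file, if provided
    received_at: str


def read_expected_rows(control_path: str) -> Optional[int]:
    """Parse an assumed control-file format containing 'row_count=<n>'."""
    with open(control_path, encoding="utf-8") as fh:
        for line in fh:
            key, _, value = line.strip().partition("=")
            if key == "row_count":
                return int(value)
    return None


def capture_acquire_metadata(
    extract_path: str, source_system: str, control_path: Optional[str] = None
) -> ExtractAuditRecord:
    """Collect file name, size, and source system; no content is inspected."""
    return ExtractAuditRecord(
        source_system=source_system,
        extract_file_name=os.path.basename(extract_path),
        size_bytes=os.path.getsize(extract_path),
        control_file_name=os.path.basename(control_path) if control_path else None,
        expected_rows=read_expected_rows(control_path) if control_path else None,
        received_at=datetime.now(timezone.utc).isoformat(),
    )


if __name__ == "__main__":
    # Demonstration with throwaway files; names and contents are made up.
    with tempfile.TemporaryDirectory() as tmp:
        extract = os.path.join(tmp, "customers.csv")
        control = os.path.join(tmp, "customers.ctl")
        with open(extract, "wb") as fh:
            fh.write(b"id,name\n1,Acme\n")
        with open(control, "w", encoding="utf-8") as fh:
            fh.write("row_count=1\n")
        print(asdict(capture_acquire_metadata(extract, "CRM", control)))
```

The sketch deliberately reads only file-level attributes and the control file, in line with the point that master data and extract content are not examined until the process stage.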
                   Process stage: In this stage of processing, data transformation and standardization,
                   including the application of data quality rules, are completed and the data is
                   prepared for loading into the data warehouse, data mart, or analytical database. In
                   this exercise both metadata and master data play key roles.
                     Metadata is used in the data structures, rules, and data quality processing.
                     Master data is used for processing and standardizing the key business entities.
                     Metadata is used to process audit data.
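
The three roles listed for the process stage can be sketched as follows. This is a minimal illustration under stated assumptions: the rule definitions, the `CUSTOMER_MASTER` lookup, the column names, and the audit counters are all hypothetical and stand in for whatever metadata repository and MDM system an actual implementation uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


def is_non_negative_number(value: str) -> bool:
    """Data quality check: value parses as a number that is >= 0."""
    try:
        return float(value) >= 0
    except ValueError:
        return False


@dataclass(frozen=True)
class ColumnRule:
    """Metadata describing one data quality rule (hypothetical layout)."""
    column: str
    description: str
    check: Callable[[str], bool]


# Assumed metadata: rules are data, so they can change without code changes.
RULES: List[ColumnRule] = [
    ColumnRule("customer", "must not be empty", lambda v: bool(v.strip())),
    ColumnRule("amount", "must be a non-negative number", is_non_negative_number),
]

# Assumed master data: known name variants mapped to one canonical entity.
CUSTOMER_MASTER: Dict[str, str] = {
    "acme corp": "ACME Corporation",
    "acme corporation": "ACME Corporation",
    "globex": "Globex Inc.",
}


def process_record(record: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Apply metadata-driven rules, then standardize against master data."""
    violations = [
        f"{rule.column}: {rule.description}"
        for rule in RULES
        if not rule.check(record.get(rule.column, ""))
    ]
    standardized = dict(record)
    key = record.get("customer", "").strip().lower()
    if key in CUSTOMER_MASTER:
        standardized["customer"] = CUSTOMER_MASTER[key]
    elif key:
        violations.append("customer: not found in master data")
    return standardized, violations


def process_batch(records: List[Dict[str, str]]):
    """Return (clean records, rejected records with reasons, audit counts)."""
    clean, rejected = [], []
    for record in records:
        standardized, violations = process_record(record)
        if violations:
            rejected.append((standardized, violations))
        else:
            clean.append(standardized)
    audit = {"read": len(records), "loaded": len(clean), "rejected": len(rejected)}
    return clean, rejected, audit
```

For example, `process_batch([{"customer": " Acme Corp ", "amount": "10.5"}])` yields one clean record whose customer is standardized to `ACME Corporation`, with audit counts recording one row read and one loaded. Keeping the rules and the master lookup as data rather than inline code is one way to keep the stage self-contained.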

                              FIGURE 9.4 Data processing cycles with integration of MDM and metadata.