Page 214 - Data Architecture

Chapter 6.1: Introduction to Data Vault 2.0
           divided into parallel work streams. The model can be built incrementally over time with
           little to no reengineering efforts when change arrives. The model can be automatically
           generated (with human input around the concepts and business keys), to expedite and
           accelerate the process.



           A Technical View


           Data vault modeling is a hybrid approach, based on third normal form and
           dimensional modeling, aimed at the logical enterprise data warehouse. The data vault
           model is built from the ground up as an incremental, modular model that can be applied
           to big data, structured, and unstructured data sets.


           DV2 modeling is focused on providing flexible, scalable patterns that work together to
           integrate raw data by business key for the enterprise data warehouse. DV2 modeling
           includes minor changes to ensure the modeling paradigms can work within the constructs
           of big data, unstructured data, multistructured data, and NoSQL.


           Data Vault Modeling 2.0 replaces sequence numbers with hash keys. The hash keys
           provide stability, parallel loading methods, and decoupled computation of parent key
           values for records. There is an alternative for engines that hash business key values
           internally: using the true business keys as they are, without sequences or hash
           surrogates. The pros and cons of each technique will be detailed in the data vault
           modeling section of this chapter.
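
           To illustrate why hash keys decouple the computation of parent key values, the
           following is a minimal sketch rather than a prescribed implementation. The function
           name hash_key, the "||" delimiter, the trim-and-uppercase normalization, the choice
           of SHA-256, and the sample keys are all assumptions made for this example; the text
           above does not specify them.

```python
import hashlib


def hash_key(*business_key_parts: str, delimiter: str = "||") -> str:
    """Derive a deterministic hash key from one or more business key parts.

    Assumptions for this sketch: parts are trimmed and upper-cased before
    hashing, joined with a delimiter, and hashed with SHA-256.
    """
    normalized = delimiter.join(part.strip().upper() for part in business_key_parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# A parent record derives its key from the business key alone.
customer_key = hash_key("CUST-1001")

# A child record loaded in a separate, parallel process computes the same
# parent key from the business key it carries. No lookup against the parent
# table (as a sequence number would require) is needed.
child_parent_key = hash_key(" cust-1001 ")

# A key spanning several business keys hashes the delimited combination.
customer_order_key = hash_key("CUST-1001", "ORD-77")

print(customer_key == child_parent_key)
```

           Because the key depends only on the business key value, parent and child loads can
           run in any order or at the same time. A sequence number, by contrast, must be
           assigned by the parent load before any child can reference it.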



           How Is Data Vault 2.0 Methodology Defined?



           A Business View


           The methodology draws on software development best practices such as CMMI, Six
           Sigma, TQM, Lean initiatives, and cycle time reduction, and applies these
           notions for repeatability, consistency, automation, and error reduction.


           DV2 methodology focuses on rapid sprint cycles (iterations) with adaptations and
           optimizations for repeatable data warehousing tasks. The aim of DV2 methodology is to
           equip the team with agile data warehousing and business intelligence best practices.
           DV2 encompasses methodology as a pillar or key component to achieve the next level of
           maturity in the data warehousing platform.

