Page 229 - Data Architecture
P. 229

Chapter 6.2: Introduction to Data Vault Modeling
           For example, let's say that in the past, for a sample customer, it was common to have one
           portfolio manager. Today, however, the company has changed the business rule so that
           there might be three or more portfolio managers assigned to a customer. If the data
           warehouse model enforces the “past” relationship (many customers to one portfolio

           manager), then to support today's relationship the data model and the ELT/ETL loading
           routines would have to be reengineered.


           There is a reason for having a many-to-many relationship implemented in a link table
           without the descriptive attributes attached. That reason would be to catch discrepancies
           across multiple source systems. A link table (for purposes of understanding) may be
           thought of as a relationship table. There are several forms of link structures including
           nonhistorized links, hierarchical links, and same-as links. These forms are functionally
           defined because they are defined in a manner that indicates the type of function or role
           the data in these structures play.


           Reengineering results in ever-increasing amounts of money because as the data set grows
           and the model grows, the time, complexity, and cost of modifications also grow.
           Eventually, this increase in cost and time to maintain the data outgrows the business’
           ability to pay.


           The only way to represent both relationships (historical and future) over time is to place
           the data in a many-to-many link table; then, based on query requirements provided by
           data marts downstream, the warehouse can tell the business users exactly what they have
           and when it “breaks” the current rule.



           Primary Key Options for Data Vault 2.0



           There are three main alternatives for selecting primary key values in a Data Vault 2.0
           model:


               • Sequence numbers
               • Hash keys
               • Business keys


           Sequence Numbers


           Sequence numbers have been around since the beginning of machines. They are system-
           generated, unique numeric values that are incremental (sequential) in nature. Sequence
           numbers have the following issues:
                                                                                                               229
   224   225   226   227   228   229   230   231   232   233   234