Page 234 - Data Architecture

Chapter 6.2: Introduction to Data Vault Modeling
           and RDBMS engines on demand. It is not suggested that it will be fast, but rather that it
           can be easily accomplished.


           A deeper analysis of this subject is covered in Data Vault 2.0 boot camp training courses
           and in Data Vault 2.0 published materials. It is beyond the scope of this book to go
           further into this subject.



           Business Keys


           Business keys have been around for as long as there have been data in operational
           applications. Business keys should be smart or intelligent keys and should be mapped to
           business concepts. That said, most business keys today are source system surrogate IDs,
           and they exhibit the same problems as the sequences mentioned above.


           A smart or intelligent key is generally defined as a sum of components where digits or
           pieces of a single field contain meaning to the business. At Lockheed Martin, for
           example, a part number consisted of several pieces (it was a superkey of sorts). The part
           key included the make, model, revision, and year of the part, like a vehicle identification
           number (VIN) found on automobiles today.
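
As a minimal sketch of how a smart key carries meaning in its pieces, the Python below splits a key into named components. The `MAKE-MODEL-REVISION-YEAR` layout, the hyphen delimiter, and the names `PartKey` and `parse_part_key` are assumptions made purely for illustration; they are not the actual part number format described above.

```python
from typing import NamedTuple


class PartKey(NamedTuple):
    """Components of a hypothetical smart part key."""
    make: str
    model: str
    revision: str
    year: int


def parse_part_key(key: str) -> PartKey:
    """Split an assumed 'MAKE-MODEL-REVISION-YEAR' key into its components."""
    parts = key.strip().upper().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 components, got {len(parts)}: {key!r}")
    make, model, revision, year = parts
    if not (year.isdigit() and len(year) == 4):
        raise ValueError(f"year component must be 4 digits: {year!r}")
    return PartKey(make, model, revision, int(year))


# Each piece of the single field is meaningful to the business on its own.
part = parse_part_key("ACME-X200-B-1998")
print(part.make, part.model, part.revision, part.year)
```

Because the meaning lives inside the key itself, any system that receives the key can interpret it without consulting the system that issued it.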


           The benefits of a smart or intelligent key stretch far beyond those of a simple surrogate or
           sequence business key. These business keys usually exhibit the following positive
           behavior at the business level:


               • They hold the same value for the life of the data set.
               • They do not change when the data are transferred between and across business OLTP applications.
               • They are not editable by business (most of the time) in the source system application.
               • They can be considered master data keys.
               • They cross business processes and provide ultimate data traceability.
               • Their largest benefit: they allow parallel loading (like hashes) and also work as keys for
               geographically distributed data sets, without needing recomputation or lookups.
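
The last point can be sketched as follows: when a key is either the business key itself or a hash derived deterministically from it, two loaders in different locations arrive at the same key without coordinating. The choice of MD5, the trim-and-uppercase normalization, and the names `hash_key` and `load_site` are assumptions for this illustration only.

```python
import hashlib


def hash_key(business_key: str) -> str:
    """Derive a deterministic hash key from a business key.

    Assumption: keys are trimmed and upper-cased before hashing so that
    cosmetic differences between sources do not produce different hashes.
    """
    normalized = business_key.strip().upper()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def load_site(rows: list[str]) -> dict[str, str]:
    """Simulate one independent loader: map each hash key to its business key."""
    return {hash_key(bk): bk.strip().upper() for bk in rows}


# Two geographically separate loaders, with no shared sequence generator
# and no lookup against a central key table.
site_a = load_site(["ACME-X200-B-1998", "ACME-X300-A-2001"])
site_b = load_site([" acme-x200-b-1998 "])

# The same business key yields the same key value at both sites.
shared = set(site_a) & set(site_b)
print(len(shared))
```

A sequence-based surrogate, by contrast, would need a single generator or a lookup step before each load, which is what prevents loading in parallel.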


           They do have three downfalls: (a) length: smart business keys can generally be longer
           than 40 characters; (b) meaning over time: the base definition can change every 5–15
           years or so (just look at how the VIN has evolved over the last 100 years); and (c)
           sometimes, source applications CAN change the business keys, which wreaks havoc on
           any of the analytics that need to be done.


           If given the choice between surrogate sequences, hashes, and natural business keys,
           natural business keys would be the preference. The original definition (even today) states
