Page 342 - Data Architecture

Chapter 9.1: Repetitive Analytics: Some Basics

           Abstract



           There are many facets to the analysis of repetitive data. One place where repetitive
           data are found is in an open-ended continuous system. Another place where repetitive
           analytics is done is in a project-based environment. A common practice in repetitive
           analytics is looking for patterns. One issue that always arises with repetitive pattern
           analysis is false positives. A useful approach for doing repetitive analytics is to
           create what is known as the “sandbox.” Analysis done in the sandbox stays inside the
           corporation. On the other hand, the analyst is not constrained with regard to the
           analysis that is done or what data can be analyzed. Log tapes often provide a basis
           for repetitive data analytics.


           Keywords



           Repetitive data; Open-ended continuous system; Project-based system; Pattern analysis;
           Outliers; False positives; The “sandbox”; Log tapes


           Some basic concepts and practices regarding analytics are nearly universal. These
           concepts and practices apply to repetitive analytics and are essential for the data
           scientist.



           Different Kinds of Analysis



           There are two distinct types of analysis: open-ended continuous analysis and
           project-based analysis. Open-ended continuous analysis is typically found in the
           structured corporate world but is occasionally found in the repetitive data world. In
           open-ended continuous analysis, the analysis starts with the gathering of data. Once
           the data are gathered, the next step is to refine and analyze them. After the data are
           analyzed, a decision or a set of decisions is made, and the results of those decisions
           affect the world. Then, more raw data are gathered, and the process starts over
           again.
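
           The open-ended continuous cycle can be sketched in a few lines of Python. This is
           only an illustrative sketch, not a method taken from the text: the function names
           (gather, refine, analyze, decide, run_cycles), the numeric "world state," and the
           decision rule that moves the state halfway toward a target are all hypothetical
           stand-ins chosen to show the feedback loop.

```python
from statistics import mean


def gather(world_state, n=5):
    """Gather raw readings from the world (hypothetical source).

    A None value is appended to stand in for an unusable raw record.
    """
    return [world_state + i for i in range(n)] + [None]


def refine(raw):
    """Refine the raw data by dropping unusable records."""
    return [value for value in raw if value is not None]


def analyze(refined):
    """Analyze the refined data; here, simply take the average."""
    return mean(refined)


def decide(summary, target=10.0):
    """Make a decision: adjust the world halfway toward a target (assumed rule)."""
    return (target - summary) / 2


def run_cycles(world_state, cycles):
    """Repeat gather -> refine -> analyze -> decide; each decision changes the world."""
    history = []
    for _ in range(cycles):
        raw = gather(world_state)
        refined = refine(raw)
        summary = analyze(refined)
        adjustment = decide(summary)
        world_state += adjustment  # the decision affects the world
        history.append((summary, adjustment))
    return world_state, history


final_state, history = run_cycles(0.0, 3)
```

           Each pass gathers data from a world that the previous decision has already changed,
           which is what makes the process continuous rather than a one-time project.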


           The process of gathering data, refining it, analyzing it, and then making decisions based