Page 356 - Data Architecture

Chapter 9.1: Repetitive Analytics: Some Basics

           Bias of the Sample



           One issue that arises with sampling is the bias of the sample. Whenever data are selected
           for inclusion in the sampling database, the selection ALWAYS introduces a bias. What the
           bias is, and how badly it colors the final analytic results, depends on the selection
           process. In some cases the bias exists but does not really matter. In other cases the
           bias of the data selected for the sampling database has a real impact on the final
           results.
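
The effect of the selection process can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the text: the population is a hypothetical list of values stored in ascending order, the function name compare_samples is invented, and "take the first N records" stands in for any convenient but non-random selection rule.

```python
import random
from statistics import mean


def compare_samples(population, sample_size, seed=0):
    """Compare a random sample with a 'first N records' sample.

    Hypothetical helper: both the name and the selection rules are
    assumptions chosen to illustrate selection bias.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable

    # Selection rule 1: every record has the same chance of being chosen.
    random_sample = rng.sample(population, sample_size)

    # Selection rule 2: take whatever comes first in storage order.
    # If storage order is related to the value being measured, the
    # sample is biased toward one end of the population.
    biased_sample = population[:sample_size]

    return {
        "population_mean": mean(population),
        "random_sample_mean": mean(random_sample),
        "biased_sample_mean": mean(biased_sample),
    }


# Hypothetical population: 1,000 values stored in ascending order.
population = list(range(1, 1001))
result = compare_samples(population, 100)

for name, value in result.items():
    print(f"{name}: {value}")
```

In this sketch both samples have the same size, so any gap between the two sample means comes from the selection rule alone. Whether a gap of that kind matters depends on the question being asked of the sampling database.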


           The analyst must constantly be aware of both the existence and the influence of bias in
           the sampling data.


           Fig. 9.1.12 shows that, when processing sampling data, each additional increment of
           accuracy comes at a high marginal cost.
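
One common way to see why extra accuracy is expensive is the standard sample-size formula for estimating a mean. The sketch below assumes simple random sampling and a normal approximation; the function name required_sample_size, the standard deviation of 10, and the 95% confidence multiplier of 1.96 are illustrative assumptions, and the figure itself may be making the point differently.

```python
import math


def required_sample_size(std_dev, margin_of_error, z=1.96):
    """Records needed so the sample mean is within margin_of_error.

    Uses n = (z * std_dev / margin_of_error) ** 2, rounded up.
    Assumes simple random sampling and a normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    if std_dev <= 0 or margin_of_error <= 0:
        raise ValueError("std_dev and margin_of_error must be positive")
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Hypothetical measurement with a standard deviation of 10 units.
# Each step halves the margin of error.
for margin in (2.0, 1.0, 0.5):
    n = required_sample_size(10, margin)
    print(f"margin of error {margin}: about {n} records")
```

Because the margin of error appears squared in the denominator, halving it roughly quadruples the number of records to process. Under these assumptions, the cost of each further gain in accuracy rises much faster than the accuracy itself.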


















































