3.4 Data Reduction














                     Figure 3.9 Sampling can be used for data reduction. (Panels: cluster sample; stratified sample.)


                                 a cluster. A reduced data representation can be obtained by applying, say, SRSWOR
                                 to the pages, resulting in a cluster sample of the tuples. Other clustering criteria
                                 conveying rich semantics can also be explored. For example, in a spatial database, we
                                 may choose to define clusters geographically based on how closely different areas are
                                 located.
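
                                 The page-based cluster sample described above can be sketched in a few lines of
                                 Python. This is only an illustrative sketch, not a prescribed implementation: the
                                 function name cluster_sample and the parameters page_size, s, and seed are
                                 hypothetical, and a "page" is assumed here to be a consecutive run of page_size
                                 tuples.

```python
import random


def cluster_sample(tuples, page_size, s, seed=None):
    """Draw a cluster sample: SRSWOR over pages, keeping every tuple on a chosen page.

    Assumption (not from the text): a page is a run of `page_size`
    consecutive tuples, so the pages are mutually disjoint clusters.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    # Partition the data set into pages (clusters).
    pages = [tuples[i:i + page_size] for i in range(0, len(tuples), page_size)]

    if not 0 <= s <= len(pages):
        raise ValueError("s must be between 0 and the number of pages")

    # Simple random sample WITHOUT replacement, taken over pages, not tuples.
    rng = random.Random(seed)
    chosen_pages = rng.sample(pages, s)

    # The cluster sample consists of all tuples on the selected pages.
    return [t for page in chosen_pages for t in page]


if __name__ == "__main__":
    D = list(range(100))                      # 100 tuples, 10 pages of 10
    sample = cluster_sample(D, page_size=10, s=3, seed=0)
    print(len(sample))                        # 3 pages x 10 tuples each
```

                                 Because the random draw is made over pages, every tuple on a selected page enters
                                 the sample, so the cost of sampling depends on the number of pages read, not on
                                 the number of tuples.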
                                 Stratified sample: If D is divided into mutually disjoint parts called strata, a stratified
                                 sample of D is generated by obtaining an SRS at each stratum. This helps ensure a