
          Chapter 4 Data Warehousing and Online Analytical Processing



                         (a) Present an example illustrating such a huge and sparse data cube.
                         (b) Design an implementation method that can elegantly overcome this sparse matrix
                            problem. Note that you need to explain your data structures in detail and discuss
                            the space needed, as well as how to retrieve data from your structures.
                         (c) Modify your design in (b) to handle incremental data updates. Give the reasoning
                            behind your new design.
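
                         As one possible starting point for parts (b) and (c), the sketch below stores only
                         the non-empty cells of a cube in a hash map keyed by coordinate tuples. It is an
                         illustration under stated assumptions, not a reference solution: the class name
                         SparseCube, its methods, and the choice of a summed measure are all hypothetical.
                         Space grows with the number of non-empty cells rather than with the product of the
                         dimension cardinalities, a point lookup is a single hash probe, and an incremental
                         update simply adds to the affected cell.

```python
from collections import defaultdict


class SparseCube:
    """Hypothetical sparse cube: only non-empty cells are stored."""

    def __init__(self, dims):
        # dims: ordered dimension names, e.g. ("time", "location", "item")
        self.dims = tuple(dims)
        # Maps a coordinate tuple to the summed measure of that cell.
        self.cells = defaultdict(float)

    def add(self, coords, value):
        """Incremental update: fold one new fact into its cell."""
        if len(coords) != len(self.dims):
            raise ValueError("coords must supply one value per dimension")
        self.cells[tuple(coords)] += value

    def get(self, coords):
        """Point query; an absent cell is treated as an empty (zero) cell."""
        return self.cells.get(tuple(coords), 0.0)

    def rollup(self, keep):
        """Aggregate away every dimension not named in `keep`."""
        idx = [self.dims.index(d) for d in keep]
        out = defaultdict(float)
        for coords, value in self.cells.items():
            out[tuple(coords[i] for i in idx)] += value
        return dict(out)


cube = SparseCube(["time", "location", "item"])
cube.add(("Q1", "Vancouver", "TV"), 100.0)
cube.add(("Q1", "Vancouver", "TV"), 50.0)   # later update to the same cell
cube.add(("Q2", "Toronto", "TV"), 30.0)
by_item = cube.rollup(["item"])
```

                         A hash map favors point lookups and updates; range queries over an ordered dimension
                         would call for a different structure, such as a sorted index or compressed chunks.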
                     4.9 Regarding the computation of measures in a data cube:
                         (a) Enumerate three categories of measures, based on the kind of aggregate functions
                            used in computing a data cube.
                         (b) For a data cube with the three dimensions time, location, and item, which category
                            does the function variance belong to? Describe how to compute it if the cube is
                            partitioned into many chunks.
                            Hint: The formula for computing variance is $\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2$,
                            where $\bar{x}$ is the average of the $x_i$.
                         (c) Suppose the function is “top 10 sales.” Discuss how to efficiently compute this
                            measure in a data cube.
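
                         For part (b), one way to see how variance can be computed over a chunked cube is to
                         keep three partial aggregates per chunk (count, sum, and sum of squares) and merge
                         them. The sketch below assumes each chunk is simply a list of numeric values; the
                         names Partial, summarize, merge, and variance are hypothetical. It relies on the
                         identity $\frac{1}{N}\sum_i (x_i - \bar{x})^2 = \frac{1}{N}\sum_i x_i^2 - \bar{x}^2$.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Partial:
    """Per-chunk summary: count, sum, and sum of squares."""
    n: int = 0
    s: float = 0.0
    ss: float = 0.0


def summarize(chunk):
    """Scan one chunk once and return its partial aggregates."""
    p = Partial()
    for x in chunk:
        p.n += 1
        p.s += x
        p.ss += x * x
    return p


def merge(a, b):
    """Combine two partial summaries; each component is a plain sum."""
    return Partial(a.n + b.n, a.s + b.s, a.ss + b.ss)


def variance(p):
    """Population variance from merged partials: E[x^2] - (E[x])^2."""
    if p.n == 0:
        raise ValueError("variance of an empty set is undefined")
    mean = p.s / p.n
    return p.ss / p.n - mean * mean


chunks = [[1.0, 2.0], [3.0, 4.0, 5.0]]
total = reduce(merge, (summarize(c) for c in chunks), Partial())
result = variance(total)
```

                         The sum-of-squares form can lose precision when the mean is large relative to the
                         spread; merging per-chunk means and squared deviations is a more stable alternative.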
                    4.10 Suppose a company wants to design a data warehouse to facilitate the analysis of moving
                         vehicles in an online analytical processing manner. The company registers huge amounts
                         of auto movement data in the format of (Auto ID, location, speed, time). Each Auto ID
                         represents a vehicle associated with information (e.g., vehicle category, driver category),
                         and each location may be associated with a street in a city. Assume that a street map is
                         available for the city.
                         (a) Design such a data warehouse to facilitate effective online analytical processing in
                            multidimensional space.
                         (b) The movement data may contain noise. Discuss how you would develop a method
                            to automatically discover data records that were likely erroneously registered in the
                            data repository.
                         (c) The movement data may be sparse. Discuss how you would develop a method that
                            constructs a reliable data warehouse despite the sparsity of data.
                         (d) If you want to drive from A to B starting at a particular time, discuss how a system
                            may use the data in this warehouse to work out a fast route.
                    4.11 Radio-frequency identification is commonly used to trace object movement and perform
                         inventory control. An RFID reader can successfully read an RFID tag from
                         a limited distance at any scheduled time. Suppose a company wants to design a data
                         warehouse to facilitate the analysis of objects with RFID tags in an online analytical
                         processing manner. The company registers huge amounts of RFID data in the format of
                         (RFID, at location, time), and also has some information about the objects carrying the
                         RFID tag, for example, (RFID, product name, product category, producer, date produced,
                         price).
                         (a) Design a data warehouse to facilitate effective registration and online analytical
                            processing of such data.