Page 16 - Computational Statistics Handbook with MATLAB

Chapter 1




                             Introduction










                             1.1 What Is Computational Statistics?
                             Computational statistics is closely related to the traditional discipline of
                             statistics. So, before we define computational statistics proper, we need to
                             establish what we mean by the field of statistics. At the most basic level,
                             statistics is concerned with the transformation of raw data into knowledge
                             [Wegman, 1988].
                              When faced with an application requiring the analysis of raw data, any sci-
                             entist must address questions such as:

                                • What data should be collected to answer the questions in the anal-
                                   ysis?
                                • How much data should be collected?
                                • What conclusions can be drawn from the data?
                                • How far can those conclusions be trusted?

                             Statistics is concerned with the science of uncertainty and can help the
                             scientist address these questions. Many classical statistical methods
                             developed over the last century (regression, hypothesis testing, parameter
                             estimation, confidence intervals, etc.) are familiar to scientists and are
                             widely used in many disciplines [Efron and Tibshirani, 1991].
                              Now, what do we mean by computational statistics? Here we again follow
                             the definition given in Wegman [1988]. Wegman defines computational sta-
                             tistics as a collection of techniques that have a strong “focus on the exploita-
                             tion of computing in the creation of new statistical methodology.”
                              Many of these methodologies became feasible only with the development of
                             inexpensive computing hardware beginning in the 1980s. This computing
                             revolution has enabled scientists and engineers to store and process massive
                             amounts of data. However, these data are typically collected without a clear
                             idea of what they will be used for in a study. For instance, in the practice of
                             data analysis today, we often collect the data and then we design a study to







                             © 2002 by Chapman & Hall/CRC