Computational Statistics Handbook with MATLAB


Chapter 1

Introduction

1.1 What Is Computational Statistics?
Obviously, computational statistics relates to the traditional discipline of
statistics. So, before we define computational statistics proper, we need to
get a handle on what we mean by the field of statistics. At the most basic
level, statistics is concerned with the transformation of raw data into
knowledge [Wegman, 1988].
When faced with an application requiring the analysis of raw data, any
scientist must address questions such as:

• What data should be collected to answer the questions in the analysis?
• How much data should be collected?
• What conclusions can be drawn from the data?
• How far can those conclusions be trusted?

Statistics is concerned with the science of uncertainty and can help the
scientist deal with these questions. Many classical statistical methods
developed over the last century (regression, hypothesis testing, parameter
estimation, confidence intervals, etc.) are familiar to scientists and are
widely used in many disciplines [Efron and Tibshirani, 1991].
Now, what do we mean by computational statistics? Here we again follow
the definition given in Wegman [1988]. Wegman defines computational
statistics as a collection of techniques that have a strong “focus on the
exploitation of computing in the creation of new statistical methodology.”
Many of these methodologies became feasible with the development of
inexpensive computing hardware since the 1980s. This computing revolution
has enabled scientists and engineers to store and process massive
amounts of data. However, these data are typically collected without a clear
idea of what they will be used for in a study. For instance, in the practice of
data analysis today, we often collect the data and then we design a study to
