Page 272 - Computational Statistics Handbook with MATLAB

Chapter 8: Probability Density Estimation

$$\mathrm{MIAE} = E\left[ \int \left| \hat{f}(x) - f(x) \right| \, dx \right]. \qquad (8.4)$$

These concepts are easily extended to the multivariate case.
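
As a rough illustration of Equation 8.4, the MIAE can be approximated by simulation: draw many samples, compute the integrated absolute error for each, and average. The sketch below is written in Python with NumPy rather than MATLAB. It assumes a standard normal density as the true f(x), a density histogram with fixed bins as the estimate, and a midpoint Riemann sum for the integral. The function names, the mesh on [-5, 5], the grid spacing, and the number of repetitions are all illustrative choices, not taken from the text.

```python
import numpy as np

def normal_pdf(x):
    """Standard normal density, used here as the assumed true f(x)."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def histogram_on_grid(sample, edges, grid):
    """Evaluate a density histogram built from `sample` at the points in `grid`."""
    counts, _ = np.histogram(sample, bins=edges)
    heights = counts / (sample.size * np.diff(edges))
    # Index of the bin containing each grid point (grid lies strictly inside the edges)
    idx = np.searchsorted(edges, grid, side="right") - 1
    return heights[idx]

def estimate_miae(n, n_reps=200, seed=0):
    """Monte Carlo approximation of E[ integral |f_hat - f| dx ] for sample size n."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(-5.0, 5.0, 41)          # 40 bins of width 0.25 (assumed mesh)
    dx = 0.005
    grid = -5.0 + dx * (np.arange(2000) + 0.5)  # midpoints for a Riemann sum
    f_true = normal_pdf(grid)
    iae = np.empty(n_reps)
    for r in range(n_reps):
        sample = rng.standard_normal(n)
        f_hat = histogram_on_grid(sample, edges, grid)
        iae[r] = np.sum(np.abs(f_hat - f_true)) * dx  # integrated absolute error
    return iae.mean()                                  # average over the samples

miae_100 = estimate_miae(100)
miae_1000 = estimate_miae(1000)
```

Under these assumptions we would expect `miae_1000` to be smaller than `miae_100`, since a larger sample reduces the variability of the bin heights while the bins stay the same.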

8.2 Histograms

Histograms were introduced in Chapter 5 as a graphical way of summarizing
or describing a data set. A histogram visually conveys how a data set is
distributed, reveals modes and bumps, and provides information about relative
frequencies of observations. Histograms are easy to create and are
computationally feasible. Thus, they are well suited for summarizing large data sets.
We revisit histograms here and examine optimal bin widths and where to
start the bins. We also offer several extensions of the histogram, such as the
frequency polygon and the averaged shifted histogram.

1-D Histograms

Most introductory statistics textbooks expose students to the frequency
histogram and the relative frequency histogram. The problem with these is that
the total area represented by the bins does not sum to 1. Thus, these are not
valid probability density estimates. The reader is referred to Chapter 5 for
more information on this and an example illustrating the difference between
a frequency histogram and a density histogram. Since our goal is to estimate
a bona fide probability density, we want to have a function $\hat{f}(x)$ that is
nonnegative and satisfies the constraint that

$$\int \hat{f}(x) \, dx = 1. \qquad (8.5)$$
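
The difference between the three kinds of histogram can be checked numerically. The following is a minimal sketch in Python with NumPy rather than MATLAB. It assumes a simulated standard normal sample and an illustrative bin width of 0.5, and it assumes the usual density scaling in which each bin count is divided by the sample size times the bin width. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the sketch is repeatable
x = rng.standard_normal(1000)       # hypothetical sample; any 1-D data would do

h = 0.5                             # bin width (illustrative choice)
t0 = np.floor(x.min() / h) * h      # bin origin placed on a multiple of h
# Extend the stop value by h/2 so the last edge is not lost to rounding
edges = np.arange(t0, x.max() + h + h / 2, h)

counts, _ = np.histogram(x, bins=edges)   # frequency histogram heights
rel_freq = counts / x.size                # relative frequency heights
density = counts / (x.size * h)           # density histogram heights

area_freq = np.sum(counts * h)      # equals n * h
area_rel = np.sum(rel_freq * h)     # equals h
area_density = np.sum(density * h)  # equals 1, as the constraint requires
```

Only the density scaling yields a total area of 1. The frequency histogram has area nh and the relative frequency histogram has area h, so neither integrates to 1 unless the bin width happens to make it so.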

The histogram is calculated using a random sample $X_1, X_2, \ldots, X_n$. The
analyst must choose an origin $t_0$ for the bins and a bin width $h$. These two
parameters define the mesh over which the histogram is constructed. In what
follows, we will see that it is the bin width that determines the smoothness of
the histogram. Small values of $h$ produce histograms with a lot of variation,
while larger bin widths yield smoother histograms. This phenomenon is
illustrated in Figure 8.1, where we show histograms with different bin
widths. For this reason, the bin width $h$ is sometimes referred to as the
smoothing parameter.
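
The effect of the bin width can also be seen numerically. The sketch below, in Python with NumPy rather than MATLAB, builds density histograms on a mesh defined by an origin and a bin width and compares a small and a large value of h on the same simulated sample. The helper names `density_histogram` and `total_variation`, the origin of -6, and the use of summed absolute jumps between adjacent bin heights as a roughness measure are assumptions made for illustration only.

```python
import numpy as np

def density_histogram(x, t0, h):
    """Return bin edges and density heights on the mesh t0, t0 + h, t0 + 2h, ..."""
    x = np.asarray(x, dtype=float)
    if x.min() < t0:
        raise ValueError("origin t0 must not exceed the smallest observation")
    n_bins = int(np.floor((x.max() - t0) / h)) + 1   # enough bins to cover the data
    edges = t0 + h * np.arange(n_bins + 1)
    counts, _ = np.histogram(x, bins=edges)
    return edges, counts / (x.size * h)

def total_variation(heights):
    """Sum of absolute jumps between adjacent bin heights (zero outside the mesh)."""
    padded = np.concatenate(([0.0], heights, [0.0]))
    return np.abs(np.diff(padded)).sum()

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)       # hypothetical sample

_, f_small = density_histogram(x, t0=-6.0, h=0.1)   # narrow bins
_, f_large = density_histogram(x, t0=-6.0, h=1.0)   # wide bins

rough_small = total_variation(f_small)
rough_large = total_variation(f_large)
```

For a sample like this one, `rough_small` should exceed `rough_large`: the narrow bins track sampling noise from bin to bin, while the wide bins average it out.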

Let $B_k = [t_k, t_{k+1})$ denote the $k$-th bin, where $t_{k+1} - t_k = h$, for all $k$. We
represent the number of observations that fall into the $k$-th bin by $\nu_k$. The 1-D
histogram at a point $x$ is defined as
© 2002 by Chapman & Hall/CRC
