Page 270 - Computational Statistics Handbook with MATLAB


Chapter 8

Probability Density Estimation

8.1 Introduction
We discussed several techniques for graphical exploratory data analysis in
Chapter 5. One purpose of these exploratory techniques is to obtain
information and insights about the distribution of the underlying population.
For instance, we would like to know if the distribution is multi-modal,
skewed, symmetric, etc. Another way to gain understanding about the
distribution of the data is to estimate the probability density function from
the random sample, possibly using a nonparametric probability density
estimation technique.
Estimating probability density functions is required in many areas of
computational statistics. One of these is the modeling and simulation of
physical phenomena. We often have measurements from our process, and we
would like to use those measurements to determine the probability
distribution so we can generate random variables for a Monte Carlo simulation
(Chapter 6). Another application of probability density estimation is
statistical pattern recognition (Chapter 9). In supervised learning, which is
one approach to pattern recognition, we have measurements where each one is
labeled with a class membership tag. We could use the measurements for each
class to estimate the class-conditional probability density functions, which
are then used in a Bayesian classifier. In other applications, we might need
to determine the probability that a random variable will fall within some
interval, so we would need to evaluate the cumulative distribution function.
If we have an estimate of the probability density function, then we can
easily estimate the required probability by integrating the estimated density
over that interval. Finally, in Chapter 10, we show how to use density
estimation techniques for nonparametric regression.
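
As a rough illustration of the interval-probability idea above, the following
sketch (in Python, standard library only) estimates a density from a sample
and integrates it numerically over an interval. Everything specific here is
an assumption made for illustration, not something taken from the text: the
Gaussian kernel estimator, the normal reference rule bandwidth
h = 1.06 * s * n^(-1/5), the trapezoidal rule, the simulated standard normal
sample, and the helper names gaussian_kde and integrate_trapezoid. Any other
density estimate could be substituted for f_hat.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered at that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


def integrate_trapezoid(f, a, b, num_intervals=400):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    step = (b - a) / num_intervals
    total = 0.5 * (f(a) + f(b))
    for k in range(1, num_intervals):
        total += f(a + k * step)
    return total * step


# Hypothetical data: 500 draws from a standard normal (for illustration only).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]

# Assumed bandwidth choice: normal reference rule, h = 1.06 * s * n^(-1/5).
n = len(sample)
mean = sum(sample) / n
std = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
h = 1.06 * std * n ** (-1.0 / 5.0)

f_hat = gaussian_kde(sample, h)

# Estimated P(-1 < X < 1): area under the estimated density on [-1, 1].
prob_estimate = integrate_trapezoid(f_hat, -1.0, 1.0)

# For comparison only: the exact value for a standard normal.
prob_true = math.erf(1.0 / math.sqrt(2.0))

print("estimated:", prob_estimate, "exact:", prob_true)
```

The estimate will not match the exact value, both because of sampling
variability and because kernel smoothing spreads some probability mass
outward; the point is only that an estimated density can be integrated like
any other density.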
In this chapter, we cover semi-parametric and nonparametric techniques
for probability density estimation. By these, we mean techniques where we
make few or no assumptions about what functional form the probability
density takes. This is in contrast to a parametric method, where the density
is estimated by assuming a family of distributions and then estimating its
parameters from the data.
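
The contrast between the two approaches can be sketched as follows (Python,
standard library only). The setup is entirely hypothetical and chosen for
illustration: a simulated bimodal sample, a parametric fit that assumes a
normal distribution, and a histogram with fixed bin edges standing in for a
nonparametric estimate. The helper names fit_normal, normal_pdf, and
histogram_density are likewise made up for this example.

```python
import math
import random


def fit_normal(sample):
    """Parametric route: assume a normal and estimate its two parameters."""
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / (n - 1))
    return mu, sigma


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def histogram_density(sample, left_edge, bin_width, num_bins):
    """Nonparametric route: bin heights count / (n * bin_width)."""
    n = len(sample)
    counts = [0] * num_bins
    for x in sample:
        k = int(math.floor((x - left_edge) / bin_width))
        if 0 <= k < num_bins:  # observations outside the bins are ignored
            counts[k] += 1
    return [c / (n * bin_width) for c in counts]


# Hypothetical bimodal data: equal mixture of N(-2, 0.5^2) and N(2, 0.5^2).
random.seed(1)
sample = [
    random.gauss(-2.0, 0.5) if random.random() < 0.5 else random.gauss(2.0, 0.5)
    for _ in range(1000)
]

# Parametric estimate under the (wrong) assumption of a single normal.
mu_hat, sigma_hat = fit_normal(sample)

# Histogram estimate on 16 bins of width 0.5 covering [-4, 4).
left_edge, bin_width, num_bins = -4.0, 0.5, 16
hist_heights = histogram_density(sample, left_edge, bin_width, num_bins)

# Compare both estimates at each bin center.
for k, height in enumerate(hist_heights):
    center = left_edge + (k + 0.5) * bin_width
    print(center, round(height, 3), round(normal_pdf(center, mu_hat, sigma_hat), 3))
```

In this setup the fitted normal places a single mode between the two clusters,
where there are almost no observations, while the histogram follows both
clusters. The price of the nonparametric estimate is that choices such as the
bin width have to be made instead.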
