Page 282 - Computational Statistics Handbook with MATLAB
Chapter 8: Probability Density Estimation
Example 8.3
Here we show how to create a frequency polygon using the Old Faithful
geyser data. We must first create the histogram from the data, using the
Normal Reference Rule for frequency polygons to choose the smoothing
parameter (the bin width h).
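For reference, the rule applied in the code for this example can be written as below. The symbols are notation chosen here for illustration: \hat{h} is the bin width, \hat{\sigma} is the sample standard deviation (sqrt(var(geyser)) in the code), and n is the sample size. The constant 2.15 and the exponent -1/5 are taken directly from the code.

```latex
\hat{h} = 2.15\,\hat{\sigma}\,n^{-1/5}
```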
load geyser
n = length(geyser);
% Use Normal Reference Rule for bin width
% of frequency polygon.
h = 2.15*sqrt(var(geyser))*n^(-1/5);
t0 = min(geyser)-1;
tm = max(geyser)+1;
bins = t0:h:tm;
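% Count the observations in each bin. The last element of vk
% only counts values equal to the final edge, so drop it.
vk = histc(geyser,bins);
vk(end) = [];
% Convert the counts to density histogram heights.
fhat = vk/(n*h);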
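As a quick sanity check on the steps above, the heights of a density histogram should enclose a total area of one whenever every observation falls inside the bin mesh. The following is a minimal sketch, not the geyser analysis itself: x is a small made-up sample standing in for geyser, and the mesh is built from a ceil-based bin count (an assumption made here so that the last edge is guaranteed to lie beyond the largest observation). In recent MATLAB releases histc is marked as not recommended in favor of histcounts; histc is kept here to match the text.

```matlab
% Hypothetical sample standing in for the geyser data.
x = [43 51 55 58 62 66 70 71 75 76 78 80 81 83 85 88 90 94 99 108];
n = length(x);
% Frequency polygon Normal Reference Rule for the bin width.
h = 2.15*std(x)*n^(-1/5);
t0 = min(x) - 1;
% Assumption: pick the number of bins so the mesh covers max(x).
nbins = ceil((max(x) + 1 - t0)/h);
edges = t0 + h*(0:nbins);
vk = histc(x, edges);
vk(end) = [];            % last element only counts x == edges(end)
fhat = vk/(n*h);         % density histogram heights
area = sum(fhat)*h;      % equals 1 (up to rounding) if no point is dropped
```

If area comes out below one, some observations fell outside the mesh, which is a sign that the last bin edge stops short of the largest data value.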
We then use the MATLAB function called interp1 to interpolate between
the bin centers. This function takes three arguments (and an optional fourth
argument). The first two arguments to interp1 are the xdata and ydata
vectors that contain the observed data. In our case, these are the bin centers
and the bin heights from the density histogram. The third argument is a
vector of xinterp values for which we would like to obtain interpolated
yinterp values. There is an optional fourth argument that allows the user
to select the type of interpolation (linear, cubic, nearest and spline).
The default is linear, which is what we need for the frequency polygon. The
following code constructs the frequency polygon for the geyser data.
% For frequency polygon, get the bin centers,
% with empty bin center on each end.
bc2 = (t0-h/2):h:(tm+h/2);
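% Pad the heights with a zero on each end; fhat(:).' forces a
% row vector so this works for row or column data.
binh = [0 fhat(:).' 0];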
% Use linear interpolation between bin centers
% Get the interpolated values at x.
xinterp = linspace(min(bc2),max(bc2));
fp = interp1(bc2, binh, xinterp);
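To see what interp1 does in isolation, here is a small sketch on toy values. The numbers are made up for illustration and are unrelated to the geyser data; the variable name ynear is introduced here only for the example.

```matlab
% Toy knots (hypothetical): a triangle with its peak at x = 1.
xdata = [0 1 2];
ydata = [0 10 0];
% Points at which interpolated values are wanted.
xinterp = [0.25 0.75 1.75];
% Three arguments: linear interpolation is the default.
yinterp = interp1(xdata, ydata, xinterp);           % [2.5 7.5 2.5]
% Optional fourth argument selects another method.
ynear = interp1(xdata, ydata, xinterp, 'nearest');  % [0 10 0]
```

In the example, bc2 and binh play the roles of xdata and ydata, so the linear default joins the bin-center heights with straight line segments.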
To see how this looks, we can plot the frequency polygon and the underlying
histogram, as shown in Figure 8.4.
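% To plot this, use bar with the bin centers.
% Reset tm to the last bin edge actually used, since
% t0:h:tm need not end exactly at max(geyser)+1.
tm = max(bins);
bc = (t0+h/2):h:(tm-h/2);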
bar(bc,fhat,1,'w')
hold on
plot(xinterp,fp)
hold off
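One way to check a construction like the one in this example is to confirm that the frequency polygon encloses the same unit area as the density histogram it was built from. The sketch below assumes a small hypothetical histogram (the values of t0, h, and fhat are made up, chosen so that sum(fhat)*h is one) and uses trapz for the integration. The names nb, areaKnots, and areaGrid are introduced here for illustration.

```matlab
% Hypothetical density histogram: 4 bins of width h starting at t0.
t0 = 0;
h = 0.5;
fhat = [0.2 0.6 0.8 0.4];          % heights; sum(fhat)*h equals 1
nb = length(fhat);
% Bin centers, with one empty bin center added on each end.
bc2 = t0 + h*((0:nb+1) - 0.5);
binh = [0 fhat 0];
% Evaluate the frequency polygon on a fine grid.
xinterp = linspace(min(bc2), max(bc2), 201);
fp = interp1(bc2, binh, xinterp);
% Area under the polygon, integrated over its own knots.
areaKnots = trapz(bc2, binh);
% Area from the fine grid; matches closely when the grid resolves the knots.
areaGrid = trapz(xinterp, fp);
```

Because the two added end heights are zero and the knots are equally spaced, the trapezoid rule over the knots reduces to h*sum(fhat), which is the area of the histogram itself.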
© 2002 by Chapman & Hall/CRC