Computational Statistics Handbook with MATLAB

Chapter 8: Probability Density Estimation                       271


                             Example 8.3
                             Here we show how to create a frequency polygon using the Old Faithful
                             geyser data. We first construct the density histogram from the data,
                             using the frequency polygon Normal Reference Rule to choose the bin
                             width h, which is the smoothing parameter.
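                                load geyser
                                n = length(geyser);
                                % Use Normal Reference Rule for bin width
                                % of frequency polygon.
                                h = 2.15*sqrt(var(geyser))*n^(-1/5);
                                % Bin edges run from just below the minimum
                                % toward just above the maximum. Note that
                                % t0:h:tm can stop short of tm, so it is worth
                                % checking that max(bins) >= max(geyser).
                                t0 = min(geyser)-1;
                                tm = max(geyser)+1;
                                bins = t0:h:tm;
                                % Count observations per bin. The last element
                                % returned by histc counts values equal to the
                                % final edge, so it is discarded.
                                vk = histc(geyser,bins);
                                vk(end) = [];
                                % Convert counts to density histogram heights.
                                fhat = vk/(n*h);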
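In symbols, the code above computes the two quantities below. The notation is an assumption made here for illustration rather than a quotation from the text: \hat{\sigma} is the sample standard deviation of the data, \nu_k is the number of observations falling in the k-th bin B_k, and n is the sample size.

```latex
\[
h = 2.15\,\hat{\sigma}\,n^{-1/5},
\qquad
\hat{f}(x) = \frac{\nu_k}{n\,h} \quad \text{for } x \in B_k .
\]
```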
                             We then use the MATLAB function interp1 to interpolate between the
                             bin centers. The first two arguments to interp1 are the xdata and
                             ydata vectors that contain the observed data. In our case, these are
                             the bin centers and the bin heights from the density histogram. The
                             third argument is a vector of xinterp values at which we would like
                             to obtain interpolated yinterp values. An optional fourth argument
                             selects the type of interpolation (linear, cubic, nearest or spline).
                             The default is linear, which is what we need for the frequency polygon.
                             The following code constructs the frequency polygon for the geyser data.
                                % For frequency polygon, get the bin centers,
                                % with empty bin center on each end.
                                bc2 = (t0-h/2):h:(tm+h/2);
                                binh = [0 fhat 0];
                                % Use linear interpolation between bin centers
                                % Get the interpolated values at x.
                                xinterp = linspace(min(bc2),max(bc2));
                                fp = interp1(bc2, binh, xinterp);
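As a small illustration of how interp1 treats its arguments, the sketch below uses three made-up points instead of the geyser bin centers. The values in xdata, ydata and xinterp are hypothetical and chosen only so that the results are easy to check by hand.

```matlab
% Toy data (hypothetical): a single triangular peak.
xdata = [0 1 2];
ydata = [0 2 0];
% Points at which interpolated values are wanted.
xinterp = [0.25 1.75];
% Default method is linear interpolation.
yinterp = interp1(xdata, ydata, xinterp);
% The optional fourth argument selects another method.
ynear = interp1(xdata, ydata, xinterp, 'nearest');
disp(yinterp)   % 0.5 0.5, points on the line segments
disp(ynear)     % 0 0, the values at the closest xdata
```

With the default linear method, each result lies on the straight line joining the two neighboring data points, which is exactly the shape needed for a frequency polygon.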
                             To see how this looks, we can plot the frequency polygon together with
                             the underlying histogram. The result is shown in Figure 8.4.

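                                % To plot this, use bar with the bin centers.
                                % Reset tm to the last bin edge so that the
                                % centers line up with the heights in fhat.
                                tm = max(bins);
                                bc = (t0+h/2):h:(tm-h/2);
                                % A relative width of 1 makes the bars touch;
                                % 'w' draws them in white.
                                bar(bc,fhat,1,'w')
                                hold on
                                plot(xinterp,fp)
                                hold off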
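One way to sanity-check a frequency polygon is to confirm that the area under it is one, as it should be for a density estimate. The sketch below is a self-contained illustration under stated assumptions: it uses synthetic normal data as a stand-in for the geyser measurements (the mean 70, standard deviation 13 and sample size 299 are arbitrary choices, not the real data), and it builds the bin edges with a hypothetical variable nbins so that the last edge is guaranteed to lie above the largest observation.

```matlab
% Synthetic stand-in for the data (an assumption, not the geyser data).
rng(1);
x = 70 + 13*randn(1,299);
n = length(x);
% Normal Reference Rule for the frequency polygon bin width.
h = 2.15*std(x)*n^(-1/5);
t0 = min(x) - 1;
% Choose enough bins that the last edge exceeds max(x).
nbins = ceil((max(x) + 1 - t0)/h);
edges = t0 + (0:nbins)*h;
% Bin counts; drop the final element, which only counts
% values exactly equal to the last edge.
vk = histc(x, edges);
vk(end) = [];
fhat = vk/(n*h);
% Bin centers, with one empty bin added on each end.
bc2 = t0 - h/2 + (0:nbins+1)*h;
binh = [0 fhat 0];
% Area under the piecewise linear polygon (trapezoidal rule
% is exact here because the polygon is linear between centers).
area = trapz(bc2, binh);
disp(area)
```

If the computed area falls noticeably below one, some observations were probably left outside the bin mesh, which is why the edges here are extended past the maximum.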


                            © 2002 by Chapman & Hall/CRC