Page 78 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

2.2 Presenting the Data


           histograms. A better alternative for comparing data distributions across several
           groups is the so-called box plot (or box-and-whiskers plot). As illustrated in
           Figure 2.23, a box plot uses a distinct rectangular box for each group, where each
           box spans the central 50% of the cases, the so-called inter-quartile range (IQR),
           running from the first to the third quartile. A central mark or line inside the
           box indicates the median, i.e., the value below which 50% of the cases are
           included. The boxes are prolonged with lines (whiskers) covering the range of the
           non-outlier cases, i.e., cases that lie no further than a certain factor of the
           IQR above the upper box limit or below the lower box limit. A usual IQR factor for
           outliers is 1.5. Sometimes box plots also indicate, with an appropriate mark, the
           extreme cases, defined in the same way as the outliers but using a larger IQR
           factor, usually 3. As an alternative to using the central 50% range of the cases
           around the median, one can also use the mean ± standard deviation.
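
           The quantities described above can be computed directly. The following is a
           minimal sketch in R, assuming a small made-up sample x (not taken from the cork
           stoppers data) and the quartiles returned by quantile() with its default
           settings. The names lower_out, upper_out, lower_ext and upper_ext are
           illustrative only.

```r
# Hypothetical sample, chosen only to illustrate the definitions.
x <- c(12, 15, 17, 18, 20, 21, 23, 24, 26, 60)

# First quartile, median and third quartile (default quantile() method).
q   <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
iqr <- q[3] - q[1]            # width of the box

# Limits beyond which a case counts as an outlier (factor 1.5)
# or as an extreme case (factor 3).
lower_out <- q[1] - 1.5 * iqr
upper_out <- q[3] + 1.5 * iqr
lower_ext <- q[1] - 3 * iqr
upper_ext <- q[3] + 3 * iqr

# Whiskers cover the range of the non-outlier cases.
whiskers <- range(x[x >= lower_out & x <= upper_out])

outliers <- x[x < lower_out | x > upper_out]
extremes <- x[x < lower_ext | x > upper_ext]

# The alternative box mentioned in the text: mean +/- standard deviation.
mean_sd_box <- c(mean(x) - sd(x), mean(x) + sd(x))
```

           Note that R's own boxplot function builds the box from Tukey's hinges rather
           than from quantile(), so the box limits it draws can differ slightly from the
           quartiles computed here, especially for small samples.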
              Categorised scatter plots and categorised 3D plots can also be obtained.
           Their real usefulness is, however, questionable.



           [Box plot: vertical axis ART (ticks at 200, 400, 600, 800); horizontal axis CL
           (classes 1, 2, 3).]
           Figure 2.23. Box plot of variable ART, obtained with R, for the three classes of
           the cork stoppers data. The “o” sign for Class 1 indicates an outlier, i.e., a case
           exceeding the top of the box by more than 1.5×IQR.

           Commands 2.6. SPSS, STATISTICA, MATLAB and R commands used to obtain
           box plots.

             SPSS          Graphs; Boxplot

             STATISTICA  Graphs; 2D Graphs; Boxplots

             MATLAB        boxplot(x)

             R             boxplot(x~y); legend(x,y,label)
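
           As an illustration of the R command, the following is a minimal sketch that
           draws one box per class with the formula interface. The data frame cork is a
           simulated stand-in with invented means and spreads, not the cork stoppers data
           set; only the variable names ART and CL are taken from Figure 2.23.

```r
set.seed(1)

# Simulated stand-in for the cork stoppers data (values are invented).
cork <- data.frame(
  ART = c(rnorm(50, mean = 140, sd = 40),
          rnorm(50, mean = 360, sd = 80),
          rnorm(50, mean = 700, sd = 120)),
  CL  = factor(rep(1:3, each = 50))
)

# One box per class; range = 1.5 is the default IQR factor for the whiskers.
bp <- boxplot(ART ~ CL, data = cork, range = 1.5,
              xlab = "CL", ylab = "ART")

# The returned list holds what was drawn:
# bp$stats has one column per class (lower whisker, lower hinge,
# median, upper hinge, upper whisker); bp$out holds the cases
# plotted individually as outliers.
bp$stats
bp$out
```

           Setting range = 3 instead would extend the whiskers up to 3×IQR beyond the
           box, so that only the extreme cases would be marked individually.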