Page 83 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
2 Presenting and Summarising the Data
2.3.2 Measures of Spread
The measures of spread (or dispersion) indicate how concentrated a data
distribution is. The most common measures of spread are presented next.
Commands 2.8. SPSS, STATISTICA, MATLAB and R commands used to obtain
measures of spread and shape.
SPSS        Analyze; Descriptive Statistics
STATISTICA  Statistics; Basic Statistics/Tables; Descriptive Statistics
MATLAB      iqr(x) ; range(x) ; std(x) ; var(x) ; skewness(x) ; kurtosis(x)
R           IQR(x) ; range(x) ; sd(x) ; var(x) ; skewness(x) ; kurtosis(x)
2.3.2.1 Range
The range of a dataset is the difference between its maximum and its minimum,
i.e.:
$$R = x_{\max} - x_{\min}. \tag{2.10}$$
The basic disadvantage of using the range as a measure of spread is that it
depends entirely on the extreme cases of the dataset. It also tends to increase
with the sample size, which is an additional disadvantage.
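Both disadvantages can be illustrated with a minimal Python sketch. The helper name `sample_range`, the small integer dataset, and the simulated normal draws are assumptions made for illustration only; they are not part of the original text or of any of the four packages listed above.

```python
import random

def sample_range(xs):
    """Range as in formula 2.10: maximum minus minimum."""
    return max(xs) - min(xs)

# Hypothetical dataset, chosen only for illustration.
data = [4, 5, 5, 6, 7]
clean_range = sample_range(data)           # 7 - 4 = 3
outlier_range = sample_range(data + [25])  # 25 - 4 = 21

# Nested samples of growing size drawn from the same simulated population:
# the range of a larger nested sample can never be smaller.
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]
prefix_ranges = [sample_range(draws[:n]) for n in (10, 100, 1000)]

print(clean_range, outlier_range)
print(prefix_ranges)
```

A single extreme case changes the range from 3 to 21, and the ranges of the nested samples are non-decreasing as the sample size grows.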
2.3.2.2 Inter-quartile range
The inter-quartile range is defined as (see also section 2.2.4):
$$IQR = x_{0.75} - x_{0.25}. \tag{2.11}$$
The IQR is less influenced than the range by outliers and extreme cases. It also
tends to be less influenced by the sample size (it can either increase or
decrease as the sample grows).
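The robustness of the IQR to extreme cases can be sketched in Python with the standard library. The helper name `iqr` and the datasets are hypothetical. The quartiles are computed with `statistics.quantiles` using its "inclusive" method, which is one of several quantile conventions and is an assumption here; it may not match the definition in section 2.2.4 or the defaults of the packages listed above, so values can differ slightly on other data.

```python
import statistics

def iqr(xs):
    """Inter-quartile range as in formula 2.11: x_0.75 - x_0.25.

    Assumes the 'inclusive' quantile convention of statistics.quantiles.
    """
    q1, _median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return q3 - q1

# Hypothetical datasets: the second replaces the largest value with an outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

iqr_clean = iqr(data)
iqr_outlier = iqr(with_outlier)
range_clean = max(data) - min(data)
range_outlier = max(with_outlier) - min(with_outlier)

print(iqr_clean, iqr_outlier)      # 4.0 4.0
print(range_clean, range_outlier)  # 8 89
```

The outlier leaves the quartiles, and therefore the IQR, unchanged, while the range grows from 8 to 89.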
2.3.2.3 Variance
The variance of a dataset $x_1, \ldots, x_n$ (sample variance) is defined as:
$$v = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1). \tag{2.12}$$
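As a check on formula 2.12, the following Python sketch computes the sample variance term by term and compares it with `statistics.variance` from the standard library, which also divides by n − 1. The helper name `sample_variance` and the dataset are hypothetical.

```python
import math
import statistics

def sample_variance(xs):
    """Sample variance as in formula 2.12: squared deviations over n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

# Hypothetical dataset: mean 5, squared deviations sum to 32, n - 1 = 7.
data = [2, 4, 4, 4, 5, 5, 7, 9]

v = sample_variance(data)
v_library = statistics.variance(data)  # also uses the n - 1 divisor

print(v, v_library)
```

Both values agree with 32/7 up to floating-point rounding.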