Page 50 - MATLAB Recipes for Earth Sciences


of around 17 wt% and is stored in the file sodiumcontent_two.txt.

sodium = load('sodiumcontent_two.txt');
This data set contains only 50 measurements in order to better illustrate the
effect of an outlier. We can reuse the script from the previous example to
display the data in a histogram and compute the number of observations n
with respect to the classes v. The mean of the data is 16.6379, the median is
16.9739 and the mode is 17.2109. Now we introduce a single value of 1.5
wt% in addition to the 50 measurements contained in the original data set.

sodium(51,1) = 1.5;

The histogram of this data set illustrates the distortion of the frequency
distribution by this single outlier: it shows several empty classes. The
influence of the outlier on the sample statistics is substantial. Whereas the
median of 16.9722 is relatively unaffected, the mode of 17.0558 is slightly
different since the classes have changed. The most significant change is
observed in the mean (16.3411), which is very sensitive to outliers.
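The effect of a single outlier can be reproduced with a minimal sketch. Since the file sodiumcontent_two.txt is not available here, the sketch assumes a synthetic stand-in: 50 normal quantiles centered on 17 wt% with a standard deviation of 0.8 wt%. Both values, the number of classes (12), and all variable names other than sodium, n and v are assumptions for illustration. The resulting numbers will therefore differ from those quoted above.

```matlab
% Synthetic stand-in for the sodium data (assumption: NOT the book's file).
% 50 normal quantiles with mean 17 wt% and standard deviation 0.8 wt%.
p = ((1:50)' - 0.5) / 50;                  % plotting positions in (0,1)
sodium = 17 + 0.8*sqrt(2)*erfinv(2*p - 1); % deterministic, roughly Gaussian

% Sample statistics before adding the outlier
m1   = mean(sodium);
med1 = median(sodium);

% Append one outlier of 1.5 wt% as the 51st observation
sodium_out = sodium;
sodium_out(51,1) = 1.5;

% Sample statistics after adding the outlier
m2   = mean(sodium_out);
med2 = median(sodium_out);

% Histogram with 12 equally spaced classes: n = counts, v = class centers
[n,v] = hist(sodium_out,12);
[nmax,imax] = max(n);
mode_out = v(imax);                        % center of the most populated class
nempty   = sum(n == 0);                    % number of empty classes

fprintf('mean:   %.4f -> %.4f\n', m1, m2)
fprintf('median: %.4f -> %.4f\n', med1, med2)
fprintf('mode of binned data: %.4f, empty classes: %d\n', mode_out, nempty)
```

Adding one value changes the mean by exactly (1.5 - m1)/51, which is about 0.3 wt% for this synthetic sample. The median only moves from the average of the two central values to the neighboring sorted value. The outlier also stretches the range of the data, so the classes between it and the bulk of the measurements remain empty.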

3.4 Theoretical Distributions

Now we have described the empirical frequency distribution of our sample.
A histogram is a convenient way to picture the probability distribution of the
variable x. If we sample the variable sufficiently often and the output ranges
are narrow, we obtain a very smooth version of the histogram. An infinite
number of measurements N → ∞ and an infinitely small class width produce
the random variable's probability density function (PDF). The probability
density f(x) describes how likely the variate is to take a value near x; for a
continuous variable, the probability of falling within a small interval around
x is f(x) times the interval width. The integral of f(x) is normalized to
unity, i.e., the total probability is one. The cumulative distribution function
(CDF) is the sum of a discrete PDF or the integral of a continuous PDF. The
cumulative distribution function F(x) is the probability that the variable
takes a value less than or equal to x.
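The step from a histogram to an empirical PDF and CDF can be sketched as follows. This is an illustration under stated assumptions, not the book's script: the sample x is synthetic (10000 Gaussian random numbers around 17 with a standard deviation of 0.8), the number of classes (50) is arbitrary, and the names dv, f, F and area are hypothetical.

```matlab
% Synthetic sample (assumption): 10000 Gaussian random numbers
x = 17 + 0.8*randn(10000,1);

% Histogram with 50 equally spaced classes: n = counts, v = class centers
[n,v] = hist(x,50);
dv = v(2) - v(1);            % class width

% Empirical PDF: scale the counts so that the area under the bars is one
f = n / (sum(n)*dv);

% Empirical CDF: running sum of the PDF times the class width
F = cumsum(f)*dv;

area = sum(f)*dv;            % total area under the empirical PDF
fprintf('area under PDF: %.4f, last CDF value: %.4f\n', area, F(end))
```

The commands bar(v,f) and stairs(v,F) could be used to display the two curves. Increasing the sample size while narrowing the classes makes the bar chart approach a smooth density.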

As a next step, we have to find a suitable theoretical distribution that
fits the empirical distributions described in the previous chapters. In this
section, the most frequently used theoretical distributions are introduced and
their application is described.
