Page 103 - Statistics and Data Analysis in Geology

Analysis of Sequences of Data

              to
          A     B     C     D     E    Row total
     A  0     0.11  0.30  0.16  0.43    1.00
     B  0.76  0     0.13  0.11  0       1.00
from C  0.37  0.02  0     0.48  0.13    1.00
     D  0.38  0.01  0.57  0     0.04    1.00
     E  0.40  0.34  0.13  0.13  0       1.00

The marginal probability vector is
A  0.30
B  ...
C  ...
D  0.19
E  0.17

A χ² test, identical to Equation (4.2), can be used to check for the Markov
property in an embedded sequence. This is done by comparing the observed transition
frequency matrix to the matrix expected if successive states are independent.
However, the fixed probability vector cannot be used to estimate the columns of the
expected transition probability matrix, because this would imply transitions
from a state to itself, which are forbidden. Instead, we must use a somewhat
roundabout procedure to estimate the frequencies of transitions between independent
states, subject to the constraint that states cannot succeed themselves. We
begin by imagining that our sequence is actually a censored sample taken from
an ordinary succession in which transitions from a state to itself can occur. The
transition frequency matrix of this succession would look like the one we observe
except that the diagonal elements would contain values other than zero. If we were
to compute a transition probability matrix from this frequency matrix and then
raise it to an appropriately high power, it would estimate the transition probability
matrix of a sequence in which successive states were independent. If the diagonal
elements were then discarded and the off-diagonal probabilities recalculated,
the result would be the expected transition probability matrix for an embedded
sequence whose states are independent.
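
The last two steps of this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the probability vector of the independent (uncensored) succession as given, drops the diagonal, and rescales each row so that the entry for a move from state i to state j becomes p[j] / (1 - p[i]). It then assumes that Equation (4.2), which is not reproduced on this page, has the usual Pearson form, the sum of (O - E)²/E, and that expected frequencies are the expected probabilities times the observed row totals. The function names and all numbers are hypothetical and are not taken from the example in the text.

```python
def expected_embedded_probs(p):
    """Expected transition probabilities for an embedded sequence whose
    states are independent: zero on the diagonal, p[j] / (1 - p[i]) elsewhere."""
    n = len(p)
    return [[0.0 if i == j else p[j] / (1.0 - p[i]) for j in range(n)]
            for i in range(n)]


def chi_square(observed, expected_probs):
    """Pearson statistic over the off-diagonal cells (assumed form of Eq. 4.2).
    Expected frequency = expected probability * observed row total."""
    stat = 0.0
    for i, row in enumerate(observed):
        row_total = sum(row)
        for j, obs in enumerate(row):
            if i == j:
                continue  # self-transitions are forbidden, so skip them
            exp = expected_probs[i][j] * row_total
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    return stat


# Hypothetical three-state illustration (not data from the text).
p = [0.5, 0.3, 0.2]
expected = expected_embedded_probs(p)
observed = [[0, 12, 8],
            [10, 0, 4],
            [5, 5, 0]]
stat = chi_square(observed, expected)
```

Each row of `expected` sums to one because the discarded diagonal probability is redistributed in proportion to the remaining entries.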
How do we estimate the frequencies of transitions from each state to itself,
when this information is not available? We do this by trial and error, searching
for those values that, when inserted on the diagonal of the transition frequency
matrix, do not change when the matrix is powered. The off-diagonal elements,
however, will change until a stable configuration is reached, corresponding to the
independent events model.
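
The stability criterion can be checked numerically. The sketch below converts a frequency matrix to row probabilities, squares it once, and reports how far the diagonal probabilities move. It interprets "do not change when the matrix is powered" as stability of the diagonal probabilities, which is an assumption. The helper names and the two small matrices are hypothetical, chosen so that one candidate diagonal is stable and the other is not.

```python
def to_probabilities(freq):
    """Divide each row of a frequency matrix by its row total."""
    return [[x / sum(row) for x in row] for row in freq]


def square(P):
    """Multiply a square matrix by itself."""
    n = len(P)
    return [[sum(P[i][k] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diagonal_shift(freq):
    """Largest change in any diagonal probability after one squaring."""
    P = to_probabilities(freq)
    P2 = square(P)
    return max(abs(P2[i][i] - P[i][i]) for i in range(len(P)))


# Hypothetical symmetric example: every off-diagonal count is 10.
stable = [[10, 10, 10],
          [10, 10, 10],
          [10, 10, 10]]        # diagonal guess of 10
guess = [[1000, 10, 10],
         [10, 1000, 10],
         [10, 10, 1000]]       # arbitrary large diagonal guess

shift_stable = diagonal_shift(stable)  # essentially zero
shift_guess = diagonal_shift(guess)    # clearly nonzero
```

A trial-and-error search would adjust the diagonal entries until this shift is negligible.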
In practice it is not necessary to calculate the off-diagonal probabilities at all.
We begin by assigning some arbitrarily large number, say 1000, to the diagonal
positions of the observed transition frequency matrix. The fixed probability vector
is found by summing each row and dividing by the grand total, and is then used as
an estimate of the transition probabilities along the diagonal. These probabilities
are powered by squaring and multiplied by the grand total to obtain new estimates
of the diagonal frequencies. These new estimates are inserted into the original
transition frequency matrix and the process repeated. We can work through the
first cycle of the procedure.
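
The iteration just described can be sketched as follows. This is a minimal illustration, not the book's program: the function name, the stopping tolerance, and the iteration cap are assumptions, and the input matrix is a hypothetical symmetric example rather than the frequencies analyzed in the text.

```python
def estimate_diagonals(freq, start=1000.0, tol=1e-6, max_iter=1000):
    """Iteratively estimate the diagonal frequencies of an embedded
    transition frequency matrix under the independent-events model.

    Returns the filled-in matrix and the final probability vector."""
    n = len(freq)
    m = [[float(x) for x in row] for row in freq]
    for i in range(n):
        m[i][i] = start                      # arbitrary large starting value

    for _ in range(max_iter):
        total = sum(sum(row) for row in m)   # grand total
        p = [sum(row) / total for row in m]  # row sums / grand total
        new_diag = [pi * pi * total for pi in p]  # square, then rescale
        change = max(abs(new_diag[i] - m[i][i]) for i in range(n))
        for i in range(n):
            m[i][i] = new_diag[i]            # insert the new estimates
        if change < tol:                     # assumed stopping rule
            break

    total = sum(sum(row) for row in m)
    p = [sum(row) / total for row in m]
    return m, p


# Hypothetical observed frequencies with a structurally empty diagonal.
observed = [[0, 10, 10],
            [10, 0, 10],
            [10, 10, 0]]
filled, p = estimate_diagonals(observed)
```

For this symmetric input the diagonal estimates settle at 10 and every state has probability 1/3, which satisfies the fixed-point condition that each diagonal frequency equals the squared state probability times the grand total.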
