Page 279 -
P. 279
6.4 Structural Matchinn 267
symbols, and the first column with the symbols of x, with the seed element D(0, 0)
= 0 representing a situation where no edit operations have been performed. From
this seed element, the algorithm proceeds to fill up the matrix by examining the
following three predecessors of D(i, j) (see Figure 6.17):
1. D(i, j-1): Progressing from D(i, j-1) to D(i, j) means the insertion of the
corresponding top symbol of y into x, with a cost of Dl = D(i, j- 1) + c(3L + a).
2. D(i-1, j): Progressing from D(i-1, j) to D(ij) means the deletion of the
corresponding left symbol of x, with a cost of 02 = D(i-1, j) + c(a + A).
3. D(i-1, j-1): Progressing from D(i-1, j-1) to D(i, j) means the substitution of the
corresponding left symbol of x by the corresponding top symbol of y, with a
cost of 03 = D(i-1, j-1) + c(a -+ b). If both symbols are equal, the cost of this
operation is zero.
The minimum of Dl, 02 and 03 is assigned to D(i, j).
Insert
Y
Figure 6.17. Computation of the Levenshtein distance between x and y, using
unitary costs for all elementary edit operations. The dotted line corresponds to the
minimum cost path.
Let us use Figure 6.17 to exemplify this procedure, assuming, for simplicity,
unitary costs for all edit operations. Applying rules 1 and 2 immediately fills up the
top row and left column. Now consider D(1, 1). It can be reached from D(O, 0),
D(0, I) and D(1, O), adding a cost of one to any of these transitions. If we now
consider D(3, 4), we find a minimum cost for the substitution, since both symbols
are equal, therefore we obtain D(3,4) = D(2,3).
As shown in Figure 6.17 the minimum cost of transforming x into y is 2,
representing the two previously mentioned edit operations as the first alternative to
transforming x into y. The path corresponding to this sequence of operations is also
shown in Figure 6.17.
In order for the Levenshtein distance to have the properties of a metric, the
following non-stringent conditions must be satisfied: