Chapter 4 ■ Grey-Level Segmentation 139
50 would be 50% (as opposed to 0.5, which would be 0.5%). This method is
quite old, and is sometimes called the p-tile method.
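The p-tile idea can be sketched in a few lines of C. This is only an illustration, assuming a 256-bin histogram of pixel counts and assuming that the percentage refers to the pixels at or below the threshold; the function name ptile_threshold is hypothetical and is not one of the programs supplied with this book.

```c
#include <assert.h>

/* Sketch of p-tile threshold selection (hypothetical helper).
   hist[i] is the number of pixels having grey level i, and percent is
   given in the range 0..100, so 50 means 50%.
   Returns the smallest level t such that at least `percent` percent of
   all pixels have a level <= t. */
int ptile_threshold(const unsigned long hist[256], double percent)
{
    unsigned long total = 0, running = 0;
    double target;
    int i;

    assert(percent >= 0.0 && percent <= 100.0);

    for (i = 0; i < 256; i++)
        total += hist[i];
    if (total == 0)
        return 0;               /* empty histogram: nothing to threshold */

    /* Number of pixels that should lie at or below the threshold. */
    target = (percent / 100.0) * (double)total;

    /* Walk the cumulative histogram until the target is reached. */
    for (i = 0; i < 256; i++) {
        running += hist[i];
        if ((double)running >= target)
            return i;
    }
    return 255;
}
```

For an image in which half of the pixels are at level 10 and half at level 200, a request for 50% would give a threshold of 10 under these assumptions.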
Using the histogram to select a threshold is a very common theme in
thresholding. One observation frequently made is that when a threshold is
obvious, it occurs at the low point between two peaks in the histogram. If the
histogram has two peaks, then this selection for the threshold would appear to
be a good one. The problem of selecting a threshold automatically now consists
of two steps: locating the two peaks, and finding the low point between them.
Finding the first peak in the histogram is simple: it is the bin having the
largest value. However, the second largest value is probably in the bin right
next to the largest, rather than being the second peak. Because of this, locating
the second peak is harder than it appears at first. A simple trick that frequently
works well enough is to look for the second peak by multiplying the histogram
values by the square of the distance from the first peak. This gives preference
to peaks that are not close to the maximum. So, if the largest peak is at level j
in the histogram, select the second peak as:
$$\max_{0 \le k \le 255} \left[ (k - j)^2 \, h[k] \right] \qquad \text{(EQ 4.3)}$$
where h is the histogram, and there are 256 grey levels, 0..255. This method is
implemented by the program called twopeaks.c.
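The two steps, locating the peaks and then finding the low point between them, might look something like the following. This is a minimal sketch written for illustration and is not the source of twopeaks.c; the function names are hypothetical, the histogram is assumed to hold 256 pixel counts, and ties are resolved in favour of the lowest grey level.

```c
#include <assert.h>

/* First peak: the bin having the largest value. */
int first_peak(const unsigned long h[256])
{
    int k, j = 0;

    for (k = 1; k < 256; k++)
        if (h[k] > h[j])
            j = k;
    return j;
}

/* Second peak, following EQ 4.3: the level k that maximizes
   (k - j)^2 * h[k], where j is the level of the first peak. */
int second_peak(const unsigned long h[256], int j)
{
    int k, best = j;
    double score, best_score = 0.0;

    assert(j >= 0 && j < 256);

    for (k = 0; k < 256; k++) {
        score = (double)(k - j) * (double)(k - j) * (double)h[k];
        if (score > best_score) {
            best_score = score;
            best = k;
        }
    }
    return best;
}

/* Low point: the smallest bin strictly between the two peaks.
   If no bin lies between them, the lower peak is returned. */
int valley_between(const unsigned long h[256], int a, int b)
{
    int lo = (a < b) ? a : b;
    int hi = (a < b) ? b : a;
    int k, t;

    if (hi - lo < 2)
        return lo;

    t = lo + 1;
    for (k = lo + 2; k < hi; k++)
        if (h[k] < h[t])
            t = k;
    return t;
}

/* Threshold = low point between the two major peaks. */
int twopeaks_threshold(const unsigned long h[256])
{
    int j = first_peak(h);
    int k = second_peak(h, j);

    return valley_between(h, j, k);
}
```

Consider a histogram with values 100 at level 50, 90 at level 51, and 60 at level 180. The second largest bin is level 51, right next to the maximum, but the squared-distance weighting selects level 180 as the second peak, as intended.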
A better way to identify the peaks in the histogram is to observe that they
result from many observations of grey levels that should be approximately
the same except for small disturbances (noise). If the noise is presumed to
be normally distributed, the peaks in the histogram could be approximated
by Gaussian curves. Gaussians could be fit to the histogram, and the largest
two used as the major peaks, the threshold being between them. This is an
expensive proposition, with no promise of superior performance; we don’t
know how many Gaussians there are really, how near the means are to each
other, or their standard deviations (still, see Section 4.2.1).
4.1.1 Using Edge Pixels
An edge pixel must be near the boundary between an object and the
background, or between two objects; that is why it is an edge pixel. As a result,
the levels of the edge pixels are likely to be more consistent than those of
the image as a whole. Because edge pixels will sometimes fall just inside the
object and sometimes a little outside, owing to sampling, the histogram of
their levels will be more regular than the overall histogram.
This idea was used to produce a thresholding method based on the digital
Laplacian, which is a non-directional edge-detection operator [Weszka, 1974].
The threshold is found by first computing the Laplacian of the input image.