138 Chapter 4 ■ Grey-Level Segmentation
The other class consists of those pixels that will become white:
I(i, j) ≥ T (EQ 4.2)
This assumption is only true in some real images because of noise and
illumination effects. It is not generally true that a single threshold can be
used to segment an image into objects and background regions, but it is true
in enough useful cases to be used as an initial assumption. For example,
documents scanned on any reasonable scanner these days can be thresholded
into text and background with one threshold.
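
The two-class rule above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name apply_threshold, the flat array of 8-bit pixels, and the output values 0 (black) and 255 (white) are hypothetical choices, not the book's image type or code.

```c
#include <assert.h>
#include <stddef.h>

/* Apply one global threshold T to an 8-bit grey-level image.
   Assumption: the image is a flat array of npixels values in 0..255.
   Pixels with I(i, j) >= T become white (255), as in EQ 4.2;
   all remaining pixels become black (0). */
void apply_threshold(unsigned char *pixels, size_t npixels, int T)
{
    assert(pixels != NULL || npixels == 0);
    for (size_t k = 0; k < npixels; k++)
        pixels[k] = (pixels[k] >= T) ? 255 : 0;
}
```

Note that a pixel whose grey level equals T is placed in the white class, which matches the "greater than or equal" condition in EQ 4.2.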
The threshold must be determined from the pixel values found in the image.
Some measurement or set of measurements is made on the image, and from
these, and from known characteristics of the image, the threshold is computed.
One simple, but not especially good, example of this is the use of the mean
grey level in the image as a threshold. When the grey levels are distributed
roughly symmetrically about the mean, this causes about half of the pixels to
become black and about half to become white. If this is appropriate,
it is an easy computation to perform. However, few images will be half black.
The program that thresholds an image in this way appears on the website,
and is named thrmean.c. It takes two arguments: the first is the image to be
thresholded, and the second is the name of the file into which the thresholded
image will be written.
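
The source of thrmean.c is not reproduced here, so the following is only a minimal sketch of the threshold computation itself, assuming a flat array of 8-bit pixels. The function name mean_threshold and the choice to round to the nearest integer are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return the mean grey level of the image, rounded to the nearest
   integer, for use as a global threshold.
   Assumption: pixels is a flat array of npixels values in 0..255. */
int mean_threshold(const unsigned char *pixels, size_t npixels)
{
    assert(pixels != NULL && npixels > 0);
    unsigned long long sum = 0;
    for (size_t k = 0; k < npixels; k++)
        sum += pixels[k];
    /* Adding npixels / 2 before dividing rounds to the nearest integer. */
    return (int)((sum + npixels / 2) / npixels);
}
```

For the four pixels 0, 0, 0, 200 this returns 50, so three of the four pixels fall below the threshold. A skewed grey-level distribution therefore does not split evenly around its mean.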
Although fixing the percentage of black pixels at 50% is not a good idea, there
are some image types that have a relatively fixed ratio of white to black pixels;
text images are a common example. On a given page of text having known
type styles and sizes, the percentage of black pixels should be approximately
constant. For example, on a sample of ten pages from this book the percentage
of black pixels varied from 8.46% to 15.67%, with the smaller percentage being
due to the existence of some equations on that page. Therefore, a threshold
that would cause about 15% of the pixels to be black could be applied to this
sort of image with the expectation of reasonable success.
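
A percentage of this kind can be measured for any candidate threshold by counting the pixels that would become black. The sketch below assumes a flat array of 8-bit pixels and treats a pixel as black when its grey level is below T, the complement of EQ 4.2; the function name black_percentage is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the percentage (0 to 100) of pixels that threshold T would
   turn black, that is, the pixels with grey level strictly below T.
   Assumption: pixels is a flat array of npixels values in 0..255. */
double black_percentage(const unsigned char *pixels, size_t npixels, int T)
{
    assert(pixels != NULL && npixels > 0);
    size_t black = 0;
    for (size_t k = 0; k < npixels; k++)
        if (pixels[k] < T)
            black++;
    return 100.0 * (double)black / (double)npixels;
}
```

Running such a count over several sample pages is one way to arrive at a target figure like the 15% mentioned above.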
An easy way to find a threshold of this sort is by using the histogram of
the grey levels in the image. A histogram in this context is a vector with one
component for each possible grey level in the image. The value assigned
to each component (or bin) h_i of the histogram is the number of pixels with
grey level i. Obviously, the sum of all the components in the histogram
equals the number of pixels. Given a histogram and the percentage of black
pixels desired, we can determine the number of black pixels by multiplying
the percentage (expressed as a fraction) by the total number of pixels. Then
simply count the pixels in
consecutive histogram bins, starting at bin 0, until the count is greater than or
equal to the desired number of black pixels. The threshold is the grey level
associated with the last bin counted. This method appears on the website as the
program thrpct.c; the program asks for the percentage of black pixels, where