Page 322 -
P. 322
CH APT E R 10
Grouping and Model Fitting
In the previous chapter, we collected together pixels that “looked like” one another,
using various clustering methods and various ways of measuring similarity. This
view could be applied to tokens (such as points, edge points, superpixels). It is an
intrinsically local view.
An alternative, which emphasizes a more global view, is to collect together
pixels, tokens, or whatever because they conform to some model. This approach
appears rather similar to the clustering approach in intent, but the mechanisms
and outcomes tend to be quite different. In the clustering approach, the results we
produce can have local structure, but will not necessarily have a global structure.
For example, if we try to assemble tokens into a line agglomeratively by testing
whether the token to be added is close to the line formed by the previous two
tokens, we might get a shape that is quite curved. We really need to check whether
all tokens “agree” on the parameters of the line; local consistency is not enough.
These problems are rather difficult, and strategies to attack them for one kind
of model tend to extend rather well to strategies for other kinds of model. In this
chapter, we mainly concentrate on one core problem, which is simple enough to
do in detail. We seek to find all the lines represented by a set of tokens. This
problem is usually referred to as fitting, or sometimes as grouping. There are three
important sub-problems here: If all the tokens agree on a model, what is the model?
Which tokens contribute to a particular model, and which do not? And how many
instances of the model are there?
10.1 THE HOUGH TRANSFORM
Assume we wish to fit a structure to a set of tokens (say, a line to a set of points).
One way to cluster tokens that could lie on the same structure is to record all the
structures on which each token lies and then look for structures that get many votes.
This (quite general) technique is known as the Hough transform.To fit a structure
with a Hough transform, we take each image token and determine all structures
that could pass through that token. We make a record of this set—you should think
of this as voting—and repeat the process for each token. We decide on what is
present by looking at the votes. For example, if we are grouping points that lie on
lines, we take each point and vote for all lines that could go through it; we now
do this for each point. The line (or lines) that are present should make themselves
obvious because they pass through many points and so have many votes.
10.1.1 Fitting Lines with the Hough Transform
A line is easily parametrized as a collection of points (x, y) such that
x cos θ + y sin θ + r =0.
290