1.5 Which Technologies Are Used?
Unsupervised learning is essentially a synonym for clustering. The learning process
is unsupervised since the input examples are not class-labeled. Typically, we may use
clustering to discover classes within the data. For example, an unsupervised learning
method can take, as input, a set of images of handwritten digits. Suppose that it finds
10 clusters of data. These clusters may correspond to the 10 distinct digits 0 to 9.
However, since the training data are not labeled, the learned model cannot tell us
the semantic meaning of the clusters found, that is, which digit each cluster represents.
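The point about unlabeled clusters can be illustrated with a minimal sketch. It uses k-means, one common clustering method chosen here for illustration (the text does not name a specific algorithm), on a handful of made-up 2-D points rather than digit images. The function name `kmeans`, the evenly spaced initialization, and the toy data are all assumptions.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    # Assumed initialization: take k points spaced evenly through the input.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels

# Toy data (made up): two visibly separate groups, with no class labels given.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The output identifies two groups, but the cluster IDs 0 and 1 are arbitrary indices. Nothing in the result says what either group means, which is the limitation described above.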
Semi-supervised learning is a class of machine learning techniques that make use
of both labeled and unlabeled examples when learning a model. In one approach,
labeled examples are used to learn class models and unlabeled examples are used to
refine the boundaries between classes. For a two-class problem, we can think of the
set of examples belonging to one class as the positive examples and those belonging
to the other class as the negative examples. In Figure 1.12, if we do not consider the
unlabeled examples, the dashed line is the decision boundary that best partitions
the positive examples from the negative examples. Using the unlabeled examples,
we can refine the decision boundary to the solid line. Moreover, we can detect that
the two positive examples at the top-right corner, though labeled, are likely noise or
outliers.
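One way to see how unlabeled examples can move a decision boundary is self-training, a simple semi-supervised technique used here only as an illustration; it is not necessarily the approach behind Figure 1.12. The sketch works in one dimension with a nearest-centroid rule, assumes the positive class lies at larger values, and uses made-up data. The names `boundary` and `self_train` are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def boundary(pos, neg):
    """Nearest-centroid boundary in 1-D: midpoint of the two class means."""
    return (mean(pos) + mean(neg)) / 2

def self_train(pos, neg, unlabeled, max_rounds=10):
    """Refine the boundary by pseudo-labeling the unlabeled examples.

    Assumes positives lie above the boundary and negatives below it.
    """
    b = boundary(pos, neg)
    for _ in range(max_rounds):
        # Pseudo-label every unlabeled example with the current boundary.
        pseudo_pos = [x for x in unlabeled if x > b]
        pseudo_neg = [x for x in unlabeled if x <= b]
        # Re-estimate the boundary from labeled plus pseudo-labeled examples.
        new_b = boundary(pos + pseudo_pos, neg + pseudo_neg)
        if new_b == b:  # pseudo-labels no longer change the boundary
            break
        b = new_b
    return b

# Toy data (made up): few labeled examples, several unlabeled ones.
labeled_pos = [8.0, 9.0]
labeled_neg = [1.0, 2.0]
unlabeled = [1.5, 2.5, 3.0, 6.0, 6.5, 7.0, 7.5]

initial = boundary(labeled_pos, labeled_neg)                 # labeled data only
refined = self_train(labeled_pos, labeled_neg, unlabeled)    # with unlabeled data
print(initial, round(refined, 2))  # 5.0 4.67
```

Here the boundary learned from labeled examples alone sits at 5.0, while the unlabeled examples pull it to about 4.67, closer to the gap between the two groups.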
Active learning is a machine learning approach that lets users play an active role
in the learning process. An active learning approach can ask a user (e.g., a domain
expert) to label an example, which may be from a set of unlabeled examples or
synthesized by the learning program. The goal is to optimize the model quality by
actively acquiring knowledge from human users, given a constraint on how many
examples they can be asked to label.
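The labeling-budget idea can be sketched with pool-based uncertainty sampling, one common active learning strategy chosen here as an assumption; the text does not prescribe a strategy, and this sketch does not cover synthesized examples. The learner is a 1-D threshold classifier that assumes negatives lie below positives, and the `oracle` function is a hypothetical stand-in for the human expert.

```python
def active_learn(pool, oracle, budget):
    """Uncertainty sampling for a 1-D threshold classifier.

    Assumes all negatives lie below all positives. Returns the estimated
    threshold and the dict of examples the oracle was asked to label.
    """
    labeled = {}
    lo, hi = min(pool), max(pool)  # interval that must contain the threshold
    for _ in range(budget):
        candidates = [x for x in pool if x not in labeled and lo <= x <= hi]
        if not candidates:
            break  # nothing uncertain is left to ask about
        mid = (lo + hi) / 2
        # The most uncertain example is the one closest to the current boundary.
        x = min(candidates, key=lambda c: abs(c - mid))
        labeled[x] = oracle(x)  # one query spent from the budget
        if labeled[x]:
            hi = x  # positive: threshold is at or below x
        else:
            lo = x  # negative: threshold is above x
    return (lo + hi) / 2, labeled

# Toy setup (made up): 101 unlabeled integers; the "expert" knows the true rule.
pool = list(range(0, 101))
oracle = lambda x: x >= 63  # hypothetical stand-in for a domain expert

threshold, labeled = active_learn(pool, oracle, budget=10)
print(threshold, len(labeled))  # 62.5 6
```

With a budget of 10 queries, the learner locates the boundary between 62 and 63 after only 6 labels, instead of asking the expert to label all 101 examples.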
Figure 1.12 Semi-supervised learning. [Figure legend: positive example; negative example; unlabeled example; noise/outliers; decision boundary without unlabeled examples (dashed line); decision boundary with unlabeled examples (solid line).]