Page 89 - Algorithm Collections for Digital Signal Processing Applications using MATLAB
P. 89
2. Probability and Random Process 77
3. K-MEANS ALGORITHM FOR PATTERN
RECOGNITION
If the groups of n-dimensional vectors are given, there is the need to classify
the vectors into ‘k’ category. The centroid vectors of each group are used to
represent the identity for ‘k’ category. (i.e) If the new vector is to be
classified among the ‘k’ category, the distance between the new vector with
all the centroids are computed. The centroid corresponding to the smallest
distance is treated as the identified group. The centroids of all the groups are
obtained using K-means algorithm as described below.
Consider the set of normalized marks (i.e. the mark ranges from 0 to 1)
scored by the 100 students in the class for particular subject. Each mark is
treated as the vector with one-dimensional. There is the need to classify the
collected marks into 6 groups for allocating 6 grades. This is done using k-
means algorithm as described below.
3.1 K-means Algorithm
Step 1: Initialize the 6 centroids randomly corresponding to 6 grades .
Step 2: Classify the marks and identify the grade for the individual marks
using the initialized 6 centroids as described in the section 3.
Step 3: Find the mean of the marks collected in the particular grade say ‘1’.
This is treated as the centroid corresponding to the grade 1 used for
the next iteration. This process is repeated to compute the new sets
of centroids corresponding to the individual grade.
Step 4: Go to step 2.
Step 5: The process involved from Step 2 to Step 4 is considered as the one
iteration and this process is repeated for finite number of iterations to
obtain the final centroids corresponding to each grades.
3.2 Example
The particular sets of marks (data) are subjected to k-means algorithm Final
clusters along with clusters obtained at every iteration are displayed below.
th
Note that after 6 iteration, clusters are not changed.