Page 89 - Algorithm Collections for Digital Signal Processing Applications using MATLAB
P. 89

2. Probability and Random Process                                 77

           3.       K-MEANS ALGORITHM FOR PATTERN
                    RECOGNITION


           If the groups of n-dimensional vectors are given, there is the need to classify
           the vectors into ‘k’ category. The centroid vectors of each group are used to
           represent the identity for ‘k’ category.  (i.e) If the new vector is to  be
           classified among the ‘k’ category, the distance between the new vector with
           all the centroids are computed. The centroid corresponding to the smallest
           distance is treated as the identified group. The centroids of all the groups are
           obtained using K-means algorithm as described below.
              Consider the set of normalized marks (i.e. the mark ranges from 0 to 1)
           scored by the 100 students in the class for particular subject. Each mark is
           treated as the vector with one-dimensional. There is the need to classify the
           collected marks into 6 groups for allocating 6 grades. This is done using k-
           means algorithm as described below.

           3.1      K-means Algorithm

           Step 1:  Initialize the 6 centroids randomly corresponding to 6 grades .
           Step 2:  Classify the marks and identify the grade for the individual marks
                   using the initialized 6 centroids as described in the section 3.
           Step 3:  Find the mean of the marks collected in the particular grade say ‘1’.
                   This is treated as the centroid corresponding to the grade 1 used for
                   the next iteration. This process is repeated to compute the new sets
                   of centroids corresponding to the individual grade.
           Step 4:  Go to step 2.
           Step 5:  The process involved from Step 2 to Step 4 is considered as the one
                   iteration and this process is repeated for finite number of iterations to
                   obtain the final centroids corresponding to each grades.

           3.2      Example

           The particular sets of marks (data) are subjected to k-means algorithm Final
           clusters along with clusters obtained at every iteration are displayed below.
                          th
           Note that after 6  iteration, clusters are not changed.
   84   85   86   87   88   89   90   91   92   93   94