Page 359 - From Smart Grid to Internet of Energy

Big data, privacy and security in smart grids Chapter  8 323



















             FIG. 8.6 A sample noise filtering with HTE-BD [19].


               Algorithm 8.2: HTE-BD Algorithm
                  1: Input: data, an RDD of tuples (label, features)
                  2: Input: P, the number of partitions
                  3: Input: nTrees, the number of trees for Random Forest
                  4: Input: vote, the voting strategy (majority or consensus)
                  5: Output: the filtered RDD without noise
                  6: partitions ← kFold(data, P)
                  7: filteredData ← ∅
                  8: for all train, test in partitions do
                  9:   classifiersModel ← learnClassifiers(train, nTrees)
                  10:  predictions ← predict(classifiersModel, test)
                  11:  joinedData ← join(zipWithIndex(predictions), zipWithIndex(test))
                  12:  markedData ←
                  13:  map rf, lr, knn, orig ∈ joinedData
                  14:      count ← 0
                  15:      if rf ≠ label(orig) then count ← count + 1 end if
                  16:      if lr ≠ label(orig) then count ← count + 1 end if
                  17:      if knn ≠ label(orig) then count ← count + 1 end if
                  18:      if vote = majority then
                  19:         if count ≥ 2 then (label = ∅, features(orig)) end if
                  20:         if count < 2 then orig end if
                  21:      else
                  22:         if count = 3 then (label = ∅, features(orig)) end if
                  23:         if count ≠ 3 then orig end if
                  24:      end if
                  25:   end map
                  26:   filteredData ← union(filteredData, markedData)
                  27: end for
                  28: return(filter(filteredData, label ≠ ∅))
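
The voting logic of Algorithm 8.2 can be sketched in plain Python as follows. This is a minimal single-machine illustration, not the Spark implementation from [19]: lists stand in for RDDs, and the names `k_fold`, `is_noise`, and `hte_filter` are hypothetical. The three classifiers (called rf, lr, and knn in the listing) are passed in as arbitrary "learner" callables, which is an assumption made so the sketch needs no external library. Instead of marking a noisy sample with an empty label and filtering at the end, the sketch drops it directly, which should give the same result.

```python
from typing import Callable, Hashable, Iterator, List, Sequence, Tuple

Features = Tuple[float, ...]
Sample = Tuple[Hashable, Features]            # (label, features)
Predictor = Callable[[Features], Hashable]    # trained model: features -> label
Learner = Callable[[Sequence[Sample]], Predictor]  # training set -> trained model


def k_fold(data: Sequence[Sample], p: int) -> Iterator[Tuple[List[Sample], List[Sample]]]:
    """Yield p (train, test) pairs; fold i tests on every p-th sample starting at i."""
    for i in range(p):
        test = [s for j, s in enumerate(data) if j % p == i]
        train = [s for j, s in enumerate(data) if j % p != i]
        yield train, test


def is_noise(predictions: Sequence[Hashable], label: Hashable, vote: str) -> bool:
    """Count the classifiers that disagree with the recorded label, then vote."""
    count = sum(1 for pred in predictions if pred != label)
    if vote == "majority":
        # With three classifiers this is the listing's "count >= 2".
        return 2 * count > len(predictions)
    if vote == "consensus":
        # With three classifiers this is the listing's "count = 3".
        return count == len(predictions)
    raise ValueError(f"unknown voting strategy: {vote!r}")


def hte_filter(data: Sequence[Sample], p: int, learners: Sequence[Learner],
               vote: str = "majority") -> List[Sample]:
    """Return the samples that the ensemble does not flag as noisy."""
    filtered: List[Sample] = []
    for train, test in k_fold(data, p):
        # Train every classifier on the training folds only.
        models = [learn(train) for learn in learners]
        for label, features in test:
            predictions = [model(features) for model in models]
            if not is_noise(predictions, label, vote):
                filtered.append((label, features))
    return filtered
```

A call such as `hte_filter(data, 4, [learn_rf, learn_lr, learn_knn], "consensus")` would use three caller-supplied learners (hypothetical names here) and remove a sample only when all three disagree with its label. Consensus voting is the more conservative choice: it removes fewer samples than majority voting, so it is less likely to discard clean data but more likely to leave noise in.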