Page 359 - From Smart Grid to Internet of Energy
Chapter 8: Big data, privacy and security in smart grids
FIG. 8.6 A sample noise filtering with HTE-BD [19].
Algorithm 8.2: HTE-BD Algorithm
Input: data, an RDD of tuples (label, features)
Input: P, the number of partitions
Input: nTrees, the number of trees for Random Forest
Input: vote, the voting strategy (majority or consensus)
Output: the filtered RDD without noise
partitions ← kFold(data, P)
filteredData ← ∅
for all train, test in partitions do
    classifiersModel ← learnClassifiers(train, nTrees)
    predictions ← predict(classifiersModel, test)
    joinedData ← join(zipWithIndex(predictions), zipWithIndex(test))
    markedData ← map rf, lr, knn, orig ∈ joinedData
        count ← 0
        if rf ≠ label(orig) then count ← count + 1 end if
        if lr ≠ label(orig) then count ← count + 1 end if
        if knn ≠ label(orig) then count ← count + 1 end if
        if vote = majority then
            if count ≥ 2 then (label = ∅, features(orig)) end if
            if count < 2 then orig end if
        else
            if count = 3 then (label = ∅, features(orig)) end if
            if count ≠ 3 then orig end if
        end if
    end map
    filteredData ← union(filteredData, markedData)
end for
return(filter(filteredData, label ≠ ∅))
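The voting logic of Algorithm 8.2 can be illustrated with a minimal single-machine sketch in plain Python. It is not the Spark implementation: lists stand in for RDDs, None stands in for the empty label ∅ (so real labels are assumed never to be None), and the names is_noisy, k_fold, hte_bd and learn_stub_classifiers are hypothetical. The learn_classifiers argument is an assumed callback that returns three predictors in place of the Random Forest, Logistic Regression and kNN models, and the interleaved fold assignment is likewise an assumption.

```python
from typing import Callable, Hashable, Iterator, List, Sequence, Tuple

Features = Tuple[float, ...]
Instance = Tuple[Hashable, Features]           # (label, features)
Predictor = Callable[[Features], Hashable]     # features -> predicted label


def is_noisy(predicted: Sequence[Hashable], label: Hashable, vote: str) -> bool:
    """Return True when the ensemble flags an instance as noise."""
    # Number of classifiers whose prediction disagrees with the given label.
    count = sum(1 for p in predicted if p != label)
    if vote == "majority":
        # With three classifiers this is the same as count >= 2.
        return 2 * count > len(predicted)
    if vote == "consensus":
        # With three classifiers this is the same as count == 3.
        return count == len(predicted)
    raise ValueError(f"unknown voting strategy: {vote!r}")


def k_fold(data: List[Instance], p: int) -> Iterator[Tuple[List[Instance], List[Instance]]]:
    """Yield (train, test) pairs; fold i tests every p-th instance starting at i."""
    for i in range(p):
        test = data[i::p]
        train = [x for j, x in enumerate(data) if j % p != i]
        yield train, test


def hte_bd(
    data: List[Instance],
    p: int,
    learn_classifiers: Callable[[List[Instance]], List[Predictor]],
    vote: str = "majority",
) -> List[Instance]:
    """Filter noisy instances following the structure of Algorithm 8.2."""
    filtered_data: List[Tuple[Hashable, Features]] = []
    for train, test in k_fold(data, p):
        models = learn_classifiers(train)      # assumed: returns three predictors
        marked = []
        for label, features in test:
            predicted = [model(features) for model in models]
            if is_noisy(predicted, label, vote):
                marked.append((None, features))    # mark as noise (label = empty)
            else:
                marked.append((label, features))   # keep the original instance
        filtered_data.extend(marked)               # union(filteredData, markedData)
    # Final filter: drop every instance whose label was marked as empty.
    return [(lab, feat) for lab, feat in filtered_data if lab is not None]


# Toy usage: three fixed threshold rules stand in for the trained models.
def learn_stub_classifiers(train: List[Instance]) -> List[Predictor]:
    return [
        lambda f: int(f[0] > 0.5),
        lambda f: int(f[0] > 0.5),
        lambda f: int(f[0] > 0.1),
    ]


toy = [(0, (0.0,)), (1, (1.0,)), (1, (0.15,)), (0, (0.3,))]
clean_majority = hte_bd(toy, 2, learn_stub_classifiers, "majority")
clean_consensus = hte_bd(toy, 2, learn_stub_classifiers, "consensus")
```

In the toy run, the instance labeled 1 at feature value 0.15 is misclassified by two of the three stand-in rules, so the majority strategy removes it while the consensus strategy keeps it. This reflects the trade-off between the two voting modes: majority filters more aggressively, consensus more conservatively.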