Page 39 - Artificial Intelligence for the Internet of Everything




             Algorithm 2.2 Basic SGD for Logistic Regression























          direction to be a random gradient evaluated at $\xi^k = (X_k, Y_k)$, which is
          sampled from the oracle:

$$
\begin{aligned}
g(x_k, \xi^k) &= \nabla F(w_k, \xi^k) \\
              &= \nabla \ell(w_k; x_k, y_k) \\
              &= x_k y_k \left( \frac{e^{w^T x_k}}{1 + e^{w^T x_k}} \right)
\end{aligned}
\tag{2.2}
$$
          The implemented algorithm is then given in Algorithm 2.2.
             Again, note that each iteration evaluates only a single training point.
          By contrast, the full-gradient method must evaluate the entire dataset at
          every iteration.
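
          To make the per-iteration cost difference concrete, the following is a
          minimal Python sketch of basic SGD for logistic regression. It is not a
          transcription of Algorithm 2.2. It assumes labels in {-1, +1} and the loss
          log(1 + exp(-y w^T x)), a common convention whose sign and scaling may
          differ from Eq. (2.2). The function names, the constant step size, and the
          uniform sampling used as the oracle are all illustrative assumptions.

```python
import numpy as np


def stochastic_gradient(w, x, y):
    """Gradient of log(1 + exp(-y * w.x)) w.r.t. w for one point (assumed loss)."""
    margin = y * np.dot(w, x)
    return -y * x / (1.0 + np.exp(margin))


def full_gradient(w, X, Y):
    """Average gradient over the whole dataset: touches all n points."""
    margins = Y * (X @ w)
    coeffs = -Y / (1.0 + np.exp(margins))      # one scalar per training point
    return (X * coeffs[:, None]).mean(axis=0)


def sgd_logistic(X, Y, step_size=0.1, n_iters=5000, seed=0):
    """Basic SGD sketch: each iteration evaluates a single training point."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)                    # oracle: sample one index uniformly
        w = w - step_size * stochastic_gradient(w, X[i], Y[i])
    return w


if __name__ == "__main__":
    # Synthetic, linearly separable data (hypothetical example).
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    Y = np.where(X @ np.array([2.0, -1.0]) > 0, 1.0, -1.0)
    w = sgd_logistic(X, Y)
    print("training accuracy:", np.mean(np.sign(X @ w) == Y))
```

          Each call to stochastic_gradient costs O(d) work, whereas full_gradient
          costs O(nd). That gap is the trade-off described above: SGD takes cheaper
          but noisier steps.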

          2.3.4 SGD Variants

          As mentioned in Section 2.1, the basic SGD algorithm has some room for
          improvement. In this section we introduce two popular SGD variants: mini-
          batch SGD and SGD with momentum. Each variant gives a different yet
          useful way of improving upon the basic SGD algorithm.

          2.3.4.1 Mini-Batch SGD
          One of the major issues with SGD is that its search directions have high
          variance. Instead of moving downhill as intended, the algorithm may wander