which led to a broad discussion of stochastic optimization. Next, we discussed UQ in ML, specifically, how to develop ways for a model to know what it doesn't know. In other words, we want the model to be especially cautious about data that differ from the data on which it was trained. Section 2.5 explored recent emerging trends in adversarial learning, a new application of UQ in ML to the IoBT in both offensive and defensive capacities.
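
As a concrete illustration of a model "knowing what it doesn't know," the sketch below uses Monte Carlo dropout (in the spirit of dropout-as-Bayesian-approximation): dropout is left active at test time, and the spread across repeated stochastic forward passes serves as an uncertainty signal for inputs that lie far from the training data. This is a minimal sketch; the network, sample count, and decision threshold are illustrative assumptions, not details from the chapter.

# Minimal sketch: uncertainty quantification via Monte Carlo dropout.
# Dropout stays stochastic at inference time, and the standard deviation
# across repeated forward passes is used as an uncertainty signal.
import torch
import torch.nn as nn

# Illustrative regression network (architecture is an assumption).
model = nn.Sequential(
    nn.Linear(16, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # kept active at test time via model.train()
    nn.Linear(64, 1),
)

def predict_with_uncertainty(model, x, n_samples=100):
    """Return predictive mean and std over n_samples stochastic passes."""
    model.train()  # keep dropout enabled during inference
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

x = torch.randn(1, 16)        # a query point, possibly unlike the training data
mean, std = predict_with_uncertainty(model, x)
if std.item() > 0.5:          # illustrative threshold (assumed)
    print("High uncertainty: input may be out of distribution.")

A high predictive spread flags inputs the model has little basis for judging, which is exactly the caution toward unfamiliar data described above.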
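On the offensive side of adversarial learning, the sketch below shows the fast gradient sign method (FGSM), a standard attack from the adversarial-examples literature: the input is perturbed in the direction that most increases the loss. The toy classifier, label, and perturbation budget are assumptions for illustration, not details from the chapter.

# Minimal sketch: a fast gradient sign method (FGSM) adversarial perturbation.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 2))      # stand-in classifier (assumed)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 16, requires_grad=True)   # clean input
y = torch.tensor([0])                        # its true label
epsilon = 0.1                                # perturbation budget (assumed)

loss = loss_fn(model(x), y)
loss.backward()                              # gradient of loss w.r.t. the input

# Step the input in the sign of the gradient to maximally increase the loss;
# for image data, the result would also be clamped to the valid pixel range.
x_adv = (x + epsilon * x.grad.sign()).detach()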

