which led to a broad discussion of stochastic optimization. Next, we discussed UQ in ML, specifically, how to develop ways for a model to know what it doesn't know. In other words, we want the model to be especially cautious about data that differ from the data on which it was trained. Section 2.5 explored recent emerging trends in adversarial learning, a new application of UQ in ML to the IoBT in both offensive and defensive capacities.
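To make the idea of a model knowing what it doesn't know concrete, the following is a minimal sketch, not taken from the chapter, that estimates predictive uncertainty with a bootstrap ensemble of simple regression models. The data, the polynomial model, and all parameter values are illustrative assumptions; the point is that ensemble disagreement (the standard deviation of the member predictions) stays small near the training data and grows for inputs far from it, flagging queries the model should be cautious about.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy training data: a noisy sine observed only on the interval [0, 3].
    x_train = rng.uniform(0.0, 3.0, size=40)
    y_train = np.sin(x_train) + 0.1 * rng.normal(size=x_train.size)

    def ensemble_predict(x_query, n_models=50, degree=3):
        """Fit a bootstrap ensemble of cubic polynomials and return the
        per-point mean and standard deviation of their predictions."""
        preds = []
        for _ in range(n_models):
            idx = rng.integers(0, x_train.size, size=x_train.size)  # bootstrap resample
            coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
            preds.append(np.polyval(coeffs, x_query))
        preds = np.asarray(preds)
        return preds.mean(axis=0), preds.std(axis=0)

    # Inside the training range the ensemble members agree;
    # far outside it (x = 5, 8) they diverge, signaling low confidence.
    x_query = np.array([1.0, 2.5, 5.0, 8.0])
    mean, std = ensemble_predict(x_query)
    for xq, m, s in zip(x_query, mean, std):
        print(f"x = {xq:4.1f}  prediction = {m:8.2f}  ensemble std = {s:8.2f}")

Richer approaches (e.g., Bayesian approximations such as Monte Carlo dropout or Gaussian processes) replace the ad hoc ensemble, but the reading of predictive spread as uncertainty is the same.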