We don't have enough data. We never do. We need data to develop our CI classifier, including feature identification, architecture selection, parameter tuning (including when to stop training), and finally performance evaluation. In particular, the amount of data directly limits the complexity of the classifier we can utilize. There is a famous theorem, popularized by the eminent information theorist Thomas Cover, which bears on this point: unless the number of our training cases is at least twice the number of features we are trying to use for our classifier, we are practically guaranteed perfect performance on the training set for any arbitrary assignment of class labels to our cases [8], so that perfect performance tells us nothing. We revisit this point under algorithm development.
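A minimal sketch of this point, assuming NumPy and scikit-learn are available (the sample counts and the choice of classifier are our illustrative assumptions, not Cover's): with pure-noise features and arbitrary labels, a nearly hard-margin linear classifier still scores perfectly on its own training set whenever cases are scarce relative to features.

    # Cover's point: with n_cases < 2 * n_features, even random labels
    # on pure-noise features are (almost surely) perfectly separable.
    import numpy as np
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    n_features, n_cases = 100, 150                   # n_cases < 2 * n_features

    X = rng.standard_normal((n_cases, n_features))   # pure-noise "features"
    y = rng.integers(0, 2, size=n_cases)             # arbitrary class labels

    # Large C approximates a hard-margin linear classifier.
    clf = LinearSVC(C=1e6, max_iter=200_000).fit(X, y)
    print(f"training accuracy on random labels: {clf.score(X, y):.2f}")
    # typically prints 1.00, despite the labels being meaningless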
One valuable method for overcoming a lack of sufficient data for our task is to utilize "similar" data, either simulated or natural, in a pretraining phase of our algorithm development (as is of course natural for the human brain). Deep learning techniques in particular require vast amounts of data, so a boost from training on a related dataset may let us overcome our own data insufficiency. For example, if we want to identify dogs, we might pretrain our network on the universe of cats out there on the internet. Just don't be surprised if our CI later turns out to be much better with Chihuahuas than with Doberman pinschers. Inevitably we are still stuck with the limitations of our own particular data, facing the same perils of overtraining on any limited set, plus the biases introduced by the pretraining.
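As a hedged sketch of what such a pretraining phase might look like in practice (assuming PyTorch and torchvision; the ImageNet weights, the two-class head, and the stand-in batch are our placeholders, not anything from the text): freeze a feature extractor trained on plentiful related data and fine-tune only a small head on our scarce data.

    import torch
    import torch.nn as nn
    from torchvision import models

    # A network pretrained on a large related corpus; here ImageNet
    # weights stand in for "the universe of cats."
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    # Freeze the pretrained feature extractor; only the new head trains.
    for p in backbone.parameters():
        p.requires_grad = False

    # Replace the final layer with a head for our own two-class task.
    backbone.fc = nn.Linear(backbone.fc.in_features, 2)

    optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # One training step on a (placeholder) batch of our scarce data.
    x = torch.randn(8, 3, 224, 224)       # stand-in images
    y = torch.randint(0, 2, (8,))         # stand-in labels
    optimizer.zero_grad()
    loss = loss_fn(backbone(x), y)
    loss.backward()
    optimizer.step()

Note that the frozen features carry the pretraining corpus's biases with them, which is exactly the Chihuahua-versus-Doberman peril above.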
Life is hard, but we move on. Our data are crap: Get over it.


2.2 OUR ALGORITHM IS CRAP
Our feature selection is wrong. It is always wrong. Our chances (or those of our deep learning machine, no matter how deep) of wringing the ideal features from a massive database are minuscule. That doesn't mean we shouldn't try; it means the problem is much tougher than we think it is, and we are never going to (totally) succeed. Chen and Brown [9] show what happens with simulated microarray data containing 30 known true (discriminatory) features out of 1000, with known noise levels for all features. For this problem, of the 30 features selected as "sticking farthest out of the noise," on average only three were truly discriminatory under conditions of low intrinsic class separation, and nine under high separation.
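A rough simulation in the same spirit (our own construction, not Chen and Brown's code; the effect size and sample counts are arbitrary assumptions, so the exact counts will vary) makes the point concrete: score 1000 features, of which only the first 30 truly discriminate, and see how few of the top-scoring 30 are real.

    import numpy as np

    rng = np.random.default_rng(1)
    n_per_class, n_features, n_true = 20, 1000, 30
    effect = 0.5                     # class separation of the true features

    # Two classes of samples; only the first n_true features differ.
    X0 = rng.standard_normal((n_per_class, n_features))
    X1 = rng.standard_normal((n_per_class, n_features))
    X1[:, :n_true] += effect

    # Score each feature with a two-sample t-like statistic.
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    pooled_sd = np.sqrt((X0.var(axis=0, ddof=1) + X1.var(axis=0, ddof=1)) / 2)
    score = np.abs(diff) / (pooled_sd * np.sqrt(2 / n_per_class))

    selected = np.argsort(score)[-n_true:]   # the 30 "farthest out of the noise"
    hits = np.sum(selected < n_true)         # how many are actually real?
    print(f"true discriminatory features among the 30 selected: {hits}")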
This problem is not alleviated when our CI observer is choosing its own features. Azriel Rosenfeld was a prominent AI researcher at the University of Maryland who did a lot of interesting work on object recognition (and self-driving vehicles). He used to tell a story about one of his early successes. He had a contract with the army to develop an algorithm to identify battle tanks. He had data for scenes with and without tanks, and was enthusiastic, then puzzled, when his CI performed brilliantly; too brilliantly. Even if only a very small portion of a tank was visible in a scene, his CI observer would say a tank was there. Upon reflection he saw that the tankless scenes had been shot on a cloudy day and the with-tank ones on a sunny day. That was the sole feature his CI had needed. Our CI observer may not be very smart, but it is smarter than we are, in its own sly way.
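A toy illustration of that slyness (entirely our invention, not Rosenfeld's data): when class membership is confounded with overall lighting, mean brightness alone classifies the "scenes" almost perfectly, with no tank in sight.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    # Stand-in 16x16 "scenes": the class-relevant content is pure noise,
    # but class 1 images are systematically brighter (the sunny day).
    images = rng.standard_normal((n, 16, 16))
    labels = rng.integers(0, 2, size=n)
    images[labels == 1] += 1.0               # the confounded lighting cue

    brightness = images.mean(axis=(1, 2))    # one "sly" feature
    threshold = brightness.mean()
    acc = np.mean((brightness > threshold) == (labels == 1))
    print(f"accuracy from brightness alone: {acc:.2f}")   # near 1.00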