Page 154 - Intelligent Digital Oil And Gas Fields

Components of Artificial Intelligence and Data Analytics     117


              Although data mining is rapidly and rightly gaining popularity,
              particularly in conjunction with Big Data analytics (and DOF
              applications are certainly no exception), it carries a caveat: a data
              mining analyst may discover patterns that are meaningless because they
              are not supported by the data. Statisticians call this effect
              Bonferroni’s Principle (Leskovec et al., 2014); it may, for example,
              generate statistical artifacts rather than genuine evidence from the
              conducted search and lead to unrealistic predictive models. The
              solution comes in the form of the Bonferroni correction, which applies
              when several dependent or independent statistical tests are performed
              simultaneously on a single data set.
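
              As a minimal sketch of the Bonferroni correction, assume m tests are
              run at an overall significance level alpha; each individual test is
              then judged against the stricter threshold alpha/m. The function name
              bonferroni_reject, the choice of 1000 tests, and the simulated
              p-values below are illustrative assumptions, not taken from the cited
              references. The simulation uses pure noise, so every "discovery" is a
              statistical artifact of the kind described above.

```python
import random


def bonferroni_reject(p_values, alpha=0.05):
    """Return (per-test threshold, list of booleans: True where H0 is rejected).

    Hypothetical helper for illustration: each p-value is compared with
    alpha / m, where m is the number of simultaneous tests.
    """
    m = len(p_values)
    if m == 0:
        return alpha, []
    threshold = alpha / m  # split the overall error budget across m tests
    return threshold, [p <= threshold for p in p_values]


# Assumption for the demo: 1000 tests on pure noise. Every null hypothesis is
# true, so each p-value is uniformly distributed on [0, 1].
rng = random.Random(0)
p_values = [rng.random() for _ in range(1000)]

# Without correction, about alpha * m = 50 tests are expected to look
# "significant" even though no real pattern exists.
naive_hits = sum(p <= 0.05 for p in p_values)

threshold, rejected = bonferroni_reject(p_values, alpha=0.05)
corrected_hits = sum(rejected)

print(f"uncorrected 'discoveries': {naive_hits}")
print(f"Bonferroni threshold: {threshold:.1e}, discoveries: {corrected_hits}")
```

              The correction bounds the probability of even one false discovery by
              alpha whether or not the tests are independent, at the cost of reduced
              power to detect genuine effects when m is large.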


              4.2.2 Statistical and Machine Learning

              Although the terms statistical learning and ML differ by name, they are quite
              similar, and, in fact, both types of learning are inseparably intertwined. Sta-
              tistical learning refers to the set of tools for modeling and understanding com-
              plex and large-scale data sets, such as Big Data. It is a fairly recently developed
              area of statistics and largely complements the developments in computer sci-
              ences (e.g., advanced data management and cloud computing) and ML. ML
              addresses the question of “how to build computers that improve automatically
              through experience” (Jordan and Mitchell, 2015). This section gives a brief
              overview of the core ML methods and outlines some trends and prospects for
              future developments. It summarizes the most popular ML techniques,
              highlights its three main paradigms, and provides characteristic
              examples. As ML is
              becoming increasingly popular in the E&P industry, a few successful applica-
              tions relevant to the DOF are presented in Section 4.3.
                 Conceptually, ML algorithms can be viewed as navigating through a
              large domain of candidate programs to identify a program that optimizes
              a specified performance metric or objective. The application of ML algo-
              rithms varies greatly depending on the nature of the problem, for example,
              through use of decision trees, mathematical functions, optimization, etc.
              However, given the vast volume of Big Data, it is imperative that ML
              techniques appropriate for DOF applications share a common
              denominator: they must be highly scalable solutions that support
              cloud and HPC platforms, real-time analytics, and the rapidly
              expanding IoT, all with robust and resilient cybersecurity mechanisms
              (see Chapter 2, Instrumentation and Measurement). For more
              information, see The Elements of Statistical Learning: Data Mining,
              Inference and Prediction by Hastie et al. (2011), An Introduction to
              Statistical Learning: with Applications in R by