Page 224 - Building Big Data Applications

Index


Hive (Continued)
  design goals, 49
  execution, 51–53
  infrastructure, 51
  process flow, 52f
Hotel shopping APIs, 155
HRegionServer, 47

I
Image, 29
Infrastructure and technology
  Basho Riak, 68–69
  big data processing
    requirements, 21–22
    technologies, 22
  Cassandra. See Cassandra
  distributed data processing
    client–server data processing, 19, 19f
    distributed file based storage, 20
    extreme parallel processing, 20
    fault tolerance, 20
    generic new generation distributed data architecture, 21f
    high-speed replication, 19
    limitations, 20–21
    linearly scalable infrastructure, 18
    localized processing, 19
    master–slave configuration, 18
    minimal database usage, 20
    object oriented programming, 19–20
    programmable APIs, 20
    relational database management system (RDBMS), 18, 18f
  document-oriented databases, 69–70
  graph databases, 70
  Hadoop
    core components, 26–28, 27f
    Hadoop distributed filesystem (HDFS). See Hadoop distributed filesystem (HDFS)
    history, 26
  HBASE. See HBASE
  Hive. See Hive
  MapReduce
    features, 22–23
    Google architecture, 23–25
    programming model, 23
  Pig Latin. See Pig Latin
Infrastructure fatigue, 200–201, 201f
Intensity frontier, 88

J
Jenkins, 184
Journal, 29
Juju, 189

L
Large Electron–Positron Collider (LEP), 88
Large Hadron Collider (LHC)
  ALEPH detector, 89
  data calculations, 90
  data generation, 91–92, 91f
  data processing architecture, 92
  DELPHI detector, 89
  detectors, 89
  experiments, 88, 90
  L3 detector, 89
  location and components, 88
  OPAL detector, 89
  Worldwide LHC Computing Grid (WLCG), 90
Leptons, 87–88
Logistic regression models, 143

M
Machine intelligence, 14
Machine learning, 11, 128
MapReduce code, 123
Market interactions, 73–74, 74f
Master data
  business entities, 162
  centralized master data management system, 162
  data governance and stewardship teams, 162–163
  end state architecture, 162
  technology platform, 162
mcommerce model, 78–79
Metadata
  business intelligence metadata, 161