Page 223 - Building Big Data Applications

Index  223


                 Enterprise data warehouse (EDW), 6        Google MapReduce cluster, 24f
                 Eroom’s law, 106                            architecture, 25
                 European Council for Nuclear Research       chunkservers, 24
                      (CERN)                                 corruption, 24–25
                  Hadoop configuration, 92                    input data files, 23
                  Higgs Boson discovery, 85–86               metadata, 24
                    Big Bang theory, 96                      single point of failure (SPOF), 24
                    drag force, 94                         Graph databases, 14, 70
                    governance, 97
                    Large Hadron Collider (LHC), 95        H
                    mathematical studies, 94               Hadoop distributed filesystem (HDFS)
                    open source adoption, 97                 architecture, 28, 29f
                    quantum physics, 94                      BackupNode, 33
                    solution segment, 96                     block allocation and storage, 30
                  high-energy accelerators, 86               Checkpoint, 29–30
                  Large Hadron Collider (LHC)                CheckpointNode, 32
                    ALEPH detector, 89                       Chukwa, 54
                    data calculations, 90                    client, 30
                    data generation, 91–92, 91f              data processing problem, 27–28
                    data processing architecture, 92         DataNode, 28–29
                    DELPHI detector, 89                      Filesystem snapshots, 33–36
                    detectors, 89                            fundamental design principles, 27–28
                    experiments, 88, 90                      Image, 29
                    L3 detector, 89                          Journal, 29
                    location and components, 88              NameNode, 28
                    OPAL detector, 89                        principle goals, 28
                    Worldwide LHC Computing Grid             replication and recovery, 31
                      (WLCG), 90                             startup, 30
                  mass and energy measurement, 86          Hadoop technology, 9
                  PySpark implementation, 92               HBASE
                  quarks and leptons, 87–88                  architecture implementation, 47–49
                  service for web-based analysis (SWAN), 93  components, 48f
                  Standard Model Higgs boson, 86             data model, 46, 47f
                  XRootD filesystem interface project, 93     HBaseMaster, 47
                 Execution Engine, Hive architecture, 50     HRegionServer, 47
                                                             META table, 48–49
                 F                                           ROOT table, 48–49
                 Filesystem snapshots, 33–36               HBaseMaster, 47
                 Flight APIs, 154                          HCatalog, 54–58
                 Flume, 54                                 High frequency trades (HFTs), 137
                                                           High-Performance Computing (HPC), 110
                 G                                         Hinted handoff, 65
                 Ganglia, 185                              Hive
                 General Data Protection Regulation          architecture, 50–51, 50f
                      (GDPR), 211                            data types, 53