Page 224 - Building Big Data Applications

224 Index

Hive (Continued)
  design goals, 49
  execution, 51–53
  infrastructure, 51
  process flow, 52f
Hotel shopping APIs, 155
HRegionServer, 47

I
Image, 29
Infrastructure and technology
  Basho Riak, 68–69
  big data processing
    requirements, 21–22
    technologies, 22
  Cassandra. See Cassandra
  distributed data processing
    client–server data processing, 19, 19f
    distributed file based storage, 20
    extreme parallel processing, 20
    fault tolerance, 20
    generic new generation distributed data architecture, 21f
    high-speed replication, 19
    limitations, 20–21
    linearly scalable infrastructure, 18
    localized processing, 19
    master–slave configuration, 18
    minimal database usage, 20
    object oriented programming, 19–20
    programmable APIs, 20
    relational database management system (RDBMS), 18, 18f
  document-oriented databases, 69–70
  graph databases, 70
  Hadoop
    core components, 26–28, 27f
    Hadoop distributed filesystem (HDFS). See Hadoop distributed filesystem (HDFS)
    history, 26
  HBASE. See HBASE
  Hive. See Hive
  MapReduce
    features, 22–23
    Google architecture, 23–25
    programming model, 23
  Pig Latin. See Pig Latin
Infrastructure fatigue, 200–201, 201f
Intensity frontier, 88

J
Jenkins, 184
Journal, 29
Juju, 189

L
Large Electron–Positron Collider (LEP), 88
Large Hadron Collider (LHC)
  ALEPH detector, 89
  data calculations, 90
  data generation, 91–92, 91f
  data processing architecture, 92
  DELPHI detector, 89
  detectors, 89
  experiments, 88, 90
  L3 detector, 89
  location and components, 88
  OPAL detector, 89
  Worldwide LHC Computing Grid (WLCG), 90
Leptons, 87–88
Logistic regression models, 143

M
Machine intelligence, 14
Machine learning, 11, 128
MapReduce code, 123
Market interactions, 73–74, 74f
Master data
  business entities, 162
  centralized master data management system, 162
  data governance and stewardship teams, 162–163
  end state architecture, 162
  technology platform, 162
mcommerce model, 78–79
Metadata
  business intelligence metadata, 161