Page 53 - Building Big Data Applications


Chapter 2 Infrastructure and technology 47

This flexible data model lets HBASE store data in a column-oriented layout, grouped
by column families by design. The set of columns can grow with the data being loaded,
as long as each new column belongs to the row it is loaded into and falls under one of
the column groups (column families) predefined in the data model.
As seen in Fig. 2.13, one row key has multiple records while the other has a single
record. Storing the data in column groups lets us store more data and retrieve more of
it in the same query cycle. These data model structures are implemented within a
larger architecture in Hadoop.
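
The data model described above can be sketched as a nested mapping. This is a toy in-memory illustration only, not the HBase client API: the class `ToyTable` and its methods are hypothetical names, and the timestamps t10, t9, and t8 from Fig. 2.13 are assumed to be the integers 10, 9, and 8.

```python
class ToyTable:
    """Toy in-memory model of a column-family table (not the HBase API)."""

    def __init__(self, column_families):
        # Column families are fixed up front; qualifiers inside them are not.
        self.column_families = set(column_families)
        # row key -> "family:qualifier" -> {timestamp: value}
        self.rows = {}

    def put(self, row_key, column, timestamp, value):
        family = column.partition(":")[0]
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        columns = self.rows.setdefault(row_key, {})
        columns.setdefault(column, {})[timestamp] = value

    def get_row(self, row_key):
        """Return {column: most recent value} for one row key."""
        columns = self.rows.get(row_key, {})
        return {
            column: versions[max(versions)]
            for column, versions in columns.items()
        }


# Data from Fig. 2.13 (timestamps assumed to be plain integers).
table = ToyTable(column_families=["recipe"])
table.put("www.foodie.com", "recipe:foodie.com", 10, "FOODIE")
table.put("www.foodtv.com", "recipe:foodtv.com", 9, "FOODTV.COM")
table.put("www.foodtv.com", "recipe:food.spicy.in", 8, "FOODTV.COM")
```

New qualifiers such as `recipe:food.spicy.in` can be added at load time without changing the table definition, while a column in an undeclared family is rejected. That is the sense in which columns expand within predefined column groups.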

HBASE architecture implementation

Three main components together form the HBASE architecture from an
implementation perspective (Fig. 2.14).

The HBaseMaster is the key controller of operations in HBASE. The main functions
of the master include the following:
Monitoring region servers (one or more clusters)
Load balancing for regions
Redirecting clients to the correct region servers

For redundancy, the master can be replicated. Like the master in MapReduce,
the master in HBASE stores no data and holds only metadata about the region servers.
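
The master's role as described here (metadata only, redirecting clients, balancing regions) can be sketched as follows. This is a simplified illustration that follows the text's description, not HBase internals: `ToyMaster`, `assign`, `locate`, and `rebalance_once` are hypothetical names, and regions are assumed to be identified by their start row key.

```python
import bisect


class ToyMaster:
    """Holds only metadata: which server hosts which row-key range."""

    def __init__(self):
        self.start_keys = []  # sorted region start keys
        self.servers = []     # servers[i] hosts the region starting at start_keys[i]

    def assign(self, start_key, server):
        i = bisect.bisect_left(self.start_keys, start_key)
        self.start_keys.insert(i, start_key)
        self.servers.insert(i, server)

    def locate(self, row_key):
        """Redirect a client: return the server whose region covers row_key."""
        i = bisect.bisect_right(self.start_keys, row_key) - 1
        if i < 0:
            raise KeyError(f"no region covers {row_key!r}")
        return self.servers[i]

    def rebalance_once(self, all_servers):
        """Move one region from the busiest to the idlest server, if uneven."""
        counts = dict.fromkeys(all_servers, 0)
        for server in self.servers:
            counts[server] = counts.get(server, 0) + 1
        busiest = max(counts, key=counts.get)
        idlest = min(counts, key=counts.get)
        if counts[busiest] - counts[idlest] > 1:
            self.servers[self.servers.index(busiest)] = idlest
            return True
        return False


master = ToyMaster()
for start_key in ["", "m", "t"]:
    master.assign(start_key, "rs1")  # three regions, all on one server
```

Calling `master.locate("www.foodie.com")` returns the server for the region starting at `"t"`, and `master.rebalance_once(["rs1", "rs2"])` moves one region to the idle server. Note that the master object never stores row data, only the key-range-to-server mapping.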

The HRegionServer is the slave node in the HBASE architecture. Its primary functions
include the following:
Storing the data and its metadata
Serving client requests (write/read/scan)
Sending heartbeats to the master
Managing splits and synchronizing with the master on split and data allocation
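
The region server duties listed above can be sketched in a few lines. This is a toy model under stated assumptions, not the HRegionServer implementation: the class `ToyRegionServer`, the row-count `split_threshold`, and the shape of the heartbeat message are all hypothetical choices made for illustration.

```python
class ToyRegionServer:
    """Toy region server: stores rows, serves requests, splits, reports."""

    def __init__(self, name, split_threshold=4):
        self.name = name
        self.split_threshold = split_threshold  # assumed: max rows per region
        # region start key -> {row key: value}; "" covers the whole key space
        self.regions = {"": {}}

    def _region_for(self, row_key):
        # The covering region has the largest start key <= row_key.
        return max(k for k in self.regions if k <= row_key)

    def write(self, row_key, value):
        start = self._region_for(row_key)
        self.regions[start][row_key] = value
        if len(self.regions[start]) > self.split_threshold:
            return self._split(start)  # split info to report to the master
        return None

    def read(self, row_key):
        return self.regions[self._region_for(row_key)].get(row_key)

    def scan(self, start_key, stop_key):
        """Return sorted (row key, value) pairs with start_key <= key < stop_key."""
        rows = []
        for region in self.regions.values():
            rows.extend((k, v) for k, v in region.items()
                        if start_key <= k < stop_key)
        return sorted(rows)

    def _split(self, start):
        old = self.regions[start]
        keys = sorted(old)
        mid = keys[len(keys) // 2]  # middle row key becomes the new start key
        self.regions[start] = {k: v for k, v in old.items() if k < mid}
        self.regions[mid] = {k: v for k, v in old.items() if k >= mid}
        return (start, mid)

    def heartbeat(self):
        """Status message the server would send to the master."""
        return {"server": self.name, "regions": sorted(self.regions)}


rs = ToyRegionServer("rs1")
split_events = [rs.write(key, key.upper()) for key in ["a", "b", "c", "d", "e"]]
```

The fifth write pushes the single region past the assumed threshold, so it splits at the middle row key and the new region boundaries appear in the next heartbeat.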

Row key             TS    Column “recipe:”
“www.foodie.com”    t10   “recipe:foodie.com”      “FOODIE”
“www.foodtv.com”    t9    “recipe:foodtv.com”      “FOODTV.COM”
“www.foodtv.com”    t8    “recipe:food.spicy.in”   “FOODTV.COM”
FIGURE 2.13 HBASE data model example.
