             during the next startup, on a restart, or on demand when requested by the
             administrator or by the CheckpointNode (described later in this chapter).


             HDFS startup

             Since the namespace image is held in memory, at every startup the
             NameNode initializes the namespace image from the checkpoint file and
             replays the changes recorded in the journal. Once the startup sequence
             completes, a new checkpoint and an empty journal are written back to the
             storage directories, and the NameNode starts serving client requests.
             For improved redundancy and reliability, copies of the checkpoint and
             journal can be kept on other servers.
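
             As a minimal sketch of that redundancy, the standard hdfs-site.xml
             property dfs.namenode.name.dir accepts a comma-separated list of
             directories, and the NameNode persists its checkpoint (fsimage) and
             journal (edit log) to each of them. The paths below are placeholders;
             the second is assumed to be an NFS mount served from another machine.

                 <property>
                   <name>dfs.namenode.name.dir</name>
                   <!-- Placeholder paths; the second directory is assumed to be
                        an NFS mount backed by a different server. -->
                   <value>/data/hdfs/name,/mnt/remote-nfs/hdfs/name</value>
                 </property>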

             Block allocation and storage


             Data organization in HDFS is managed much like in GFS. The namespace is
             represented by inodes, which represent files and directories and record
             attributes such as permissions, modification and access times, and
             namespace and disk space quotas. Files are split into user-defined block
             sizes (the default is 128 MB) and stored on DataNodes, with at least two
             additional replicas of each block to ensure availability and redundancy,
             though the user can configure more replicas. The storage locations of
             block replicas may change over time and hence are not part of the
             persistent checkpoint.
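
             As a sketch of per-file control over these settings, the standard Hadoop
             FileSystem.create overload below takes the replication factor and block
             size explicitly. The path, buffer size, and sample record are
             illustrative assumptions, not taken from the text.

                 import org.apache.hadoop.conf.Configuration;
                 import org.apache.hadoop.fs.FSDataOutputStream;
                 import org.apache.hadoop.fs.FileSystem;
                 import org.apache.hadoop.fs.Path;

                 public class BlockSettingsSketch {
                     public static void main(String[] args) throws Exception {
                         FileSystem fs = FileSystem.get(new Configuration());

                         Path file = new Path("/data/raw/clicks.log"); // hypothetical path
                         short replication = 3;               // three replicas in total
                         long blockSize = 128L * 1024 * 1024; // 128 MB, the default

                         // create(path, overwrite, bufferSize, replication, blockSize)
                         try (FSDataOutputStream out =
                                 fs.create(file, true, 4096, replication, blockSize)) {
                             out.writeBytes("first record\n");
                         }
                     }
                 }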


             HDFS client

             The client is a thin interface layer that programs use to access data
             stored within HDFS. The client first contacts the NameNode to get the
             locations of the data blocks that comprise the file. Once the block
             locations are returned, the client reads the block contents from the
             DataNode closest to it.
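
             A minimal read sketch using the standard FileSystem API follows; the
             file path is a placeholder. The call to open() consults the NameNode for
             block locations, and read() then streams bytes from the nearest
             DataNode.

                 import java.nio.charset.StandardCharsets;
                 import org.apache.hadoop.conf.Configuration;
                 import org.apache.hadoop.fs.FSDataInputStream;
                 import org.apache.hadoop.fs.FileSystem;
                 import org.apache.hadoop.fs.Path;

                 public class HdfsReadSketch {
                     public static void main(String[] args) throws Exception {
                         FileSystem fs = FileSystem.get(new Configuration());
                         // Hypothetical path.
                         try (FSDataInputStream in =
                                 fs.open(new Path("/data/ingest/events.log"))) {
                             byte[] buffer = new byte[8192];
                             int n;
                             while ((n = in.read(buffer)) > 0) {
                                 System.out.print(
                                     new String(buffer, 0, n, StandardCharsets.UTF_8));
                             }
                         }
                     }
                 }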
                When writing data, the client first asks the NameNode to provide
             DataNodes where the data can be written. The NameNode returns a block to
             write the data to. When the first block is filled, additional blocks are
             provided by the NameNode in a pipeline. The block allocated for each
             request might not be on the same DataNode.
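
             The corresponding write sketch is below; again the path and sample data
             are assumptions. The output stream hides the pipeline described above:
             the NameNode hands out the first block, and as each block fills, the
             client transparently requests the next one.

                 import org.apache.hadoop.conf.Configuration;
                 import org.apache.hadoop.fs.FSDataOutputStream;
                 import org.apache.hadoop.fs.FileSystem;
                 import org.apache.hadoop.fs.Path;

                 public class HdfsWriteSketch {
                     public static void main(String[] args) throws Exception {
                         FileSystem fs = FileSystem.get(new Configuration());
                         // Hypothetical destination path.
                         Path dest = new Path("/data/ingest/events.log");
                         // Roughly 300 MB of records, so the NameNode must allocate
                         // several 128 MB blocks as the stream fills each one.
                         try (FSDataOutputStream out = fs.create(dest)) {
                             for (int i = 0; i < 20_000_000; i++) {
                                 out.writeBytes("event-" + i + "\n");
                             }
                         }
                     }
                 }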
                One of the biggest design differentiators of HDFS is the API that
             exposes the locations of a file's blocks. This allows applications such
             as MapReduce to schedule a task where the data is located, thus
             improving I/O performance. The API also includes functionality to set
             the replication factor for each file. To maintain file and block
             integrity, once a block is assigned to a DataNode, two files are created
             to represent each replica in the local host's native filesystem. The
             first file contains the data itself, and the second contains the block's
             metadata, including checksums for the data and the generation stamp.
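
             To make those two APIs concrete, here is a brief sketch using the
             standard FileSystem calls getFileBlockLocations and setReplication. The
             file path and the new replication factor of 4 are illustrative
             assumptions.

                 import org.apache.hadoop.conf.Configuration;
                 import org.apache.hadoop.fs.BlockLocation;
                 import org.apache.hadoop.fs.FileStatus;
                 import org.apache.hadoop.fs.FileSystem;
                 import org.apache.hadoop.fs.Path;

                 public class BlockLocationSketch {
                     public static void main(String[] args) throws Exception {
                         FileSystem fs = FileSystem.get(new Configuration());
                         Path file = new Path("/data/ingest/events.log"); // hypothetical

                         // Ask the NameNode where each block of the file lives, so a
                         // scheduler could place tasks next to the data.
                         FileStatus status = fs.getFileStatus(file);
                         BlockLocation[] blocks =
                             fs.getFileBlockLocations(status, 0, status.getLen());
                         for (BlockLocation block : blocks) {
                             System.out.println("offset " + block.getOffset()
                                 + " -> hosts " + String.join(",", block.getHosts()));
                         }

                         // Raise the replication factor for this one file.
                         fs.setReplication(file, (short) 4);
                     }
                 }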