Page 37 - Building Big Data Applications
P. 37

Chapter 2   Infrastructure and technology  31


                 Replication and recovery

                 In the original design of HDFS there was a single NameNode for each cluster, which
                 became the single point of failure. This has been addressed in the recent releases of
                 HDFS where NameNode replication is now a standard feature like DataNode replication.

                 NameNode and DataNodedcommunication and
                 management

                 The communication and management between a NameNode and DataNodes are
                 managed through a series of handshakes and system ID’s. Upon initial creation and
                 formatting, a namespace ID is assigned to the filesystem on the NameNode. This ID is
                 persistently stored on all the nodes across the cluster. DataNodes similarly are assigned a
                 unique storage_idon the initial creation and registration with a NameNode. This stor-
                 age_id never changes and will be persistent event if the DataNode is started on a
                 different IP Address or Port.
                   During startup process, the NameNode completes its namespace refresh and is ready
                 to establish the communication with the DataNode. To ensure that each DataNode that
                 connects to the NameNode is the correct DataNode, there is a series of verification steps:
                   The DataNode identifies itself to the NamNode with a handshake and verifies its
                   namespace ID and software version.
                   If either does not match with the NameNode, the DataNode automatically shuts
                   down.
                   The signature verification process prevents incorrect nodes from joining the cluster
                   and automatically preserves the integrity of the filesystem.
                   The signature verification process also is an assurance check for consistency of
                   software versions between the NameNode and DataNode since incompatible
                   version can cause data corruption or loss.
                   Post the handshake and validation on the NameNode, a DataNode sends a block
                   report. A block report contains the block id, the length for each block replica and
                   the generation stamp.
                   The first block report is sent immediately upon the DataNode registration.
                   Subsequently hourly updates of the block report is sent to the NameNode, which
                   provides the view of where block replicas are located on the cluster.
                   When a new DataNode is added and initialized, since it does not have a name-
                   space ID is permitted to join the cluster and receive the cluster’s namespace ID.


                 Heartbeats

                 The connectivity between the NameNode and DataNode are managed by the persistent
                 heartbeats that are sent by the DataNode every 3 seconds. The heartbeat provides the
   32   33   34   35   36   37   38   39   40   41   42