Page 48 - Building Big Data Applications
update. Reads can be served by the leader or by followers and happen in memory.
Followers may lag behind the leader, so a read from a follower can return slightly
stale data until that follower catches up and becomes consistent. This phase is
finished once a majority (or quorum) of followers have synchronized their state
with the leader.
Zab implements the following optimizations to circumvent the bottleneck of a
leader:
• Clients can connect to any server, and servers have to serve read operations
  locally and maintain information about the session of a client. This extra load
  on a follower process (a process that is not a leader) makes the load more
  evenly distributed.
• The number of servers involved is small. This means that the network
  communication overhead does not become the bottleneck that can affect
  fixed-sequencer protocols.
2. Atomic broadcast: Write requests and updates are committed in a two-phase
approach in Zab. To maintain consistency across the ensemble, write requests are
always communicated to the leader. The leader broadcasts the update to all its
followers. When a quorum of its followers (in Fig. 2.12 we need three followers) have
persisted the change (phase 1 of a two-phase commit), the leader commits the
update (phase 2 of the commit), and the requestor gets a response saying the update
succeeded. The protocol for achieving consensus is designed to be atomic, so a
change either succeeds or fails completely.
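The two-phase flow described above can be illustrated with a small in-memory
simulation. This is only a sketch under stated assumptions, not ZooKeeper's
implementation: the Follower and Leader classes, the zxid counter, the `up` flag, and
the quorum passed in as a plain number of follower acknowledgments are all
hypothetical names chosen for illustration, and recovery of proposals that were
persisted but never committed is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical in-memory follower; `up` simulates reachability."""
    name: str
    up: bool = True
    log: list = field(default_factory=list)     # persisted proposals (phase 1)
    state: dict = field(default_factory=dict)   # committed data (phase 2)

    def persist(self, zxid, key, value):
        """Phase 1: record the proposal and acknowledge it if reachable."""
        if not self.up:
            return False
        self.log.append((zxid, key, value))
        return True

    def commit(self, zxid, key, value):
        """Phase 2: apply a proposal that the leader has committed."""
        if self.up:
            self.state[key] = value


class Leader:
    """Hypothetical leader that commits only after a quorum of acks."""

    def __init__(self, followers, quorum):
        self.followers = followers
        self.quorum = quorum   # assumed: number of follower acks required
        self.state = {}
        self.zxid = 0          # monotonically increasing transaction id

    def write(self, key, value):
        """Broadcast an update; commit it only if a quorum persisted it."""
        self.zxid += 1
        acks = sum(f.persist(self.zxid, key, value) for f in self.followers)
        if acks < self.quorum:
            return False       # no committed state changes on any server
        self.state[key] = value
        for f in self.followers:
            f.commit(self.zxid, key, value)
        return True


followers = [Follower(f"f{i}") for i in range(5)]
leader = Leader(followers, quorum=3)
first = leader.write("/config", "v1")    # all five followers acknowledge
for f in followers[:3]:
    f.up = False                         # only two followers remain reachable
second = leader.write("/config", "v2")   # quorum not reached, update rejected
```

In this sketch the requestor sees a success response only for the first write; the
second leaves the committed state of every server unchanged, which mirrors the
all-or-nothing behavior the text describes.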
Locks and processing
One of the biggest issues in distributed data processing is lock management, when one
session has an exclusive lock on a server. ZooKeeper manages this process by creating a
list of child nodes and lock nodes and the associated queues of processes waiting for
lock release. The lock node is allocated to the next waiting process, based on the order
in which the requests were received.
Lock management is done through a set of watches. If you become overzealous and
set a large number of locks, managing them becomes a nightmare and creates a herd
effect on the ZooKeeper service, in which many waiting clients are notified of the same
change at once. Typically, a process sets a watch only on the process immediately
preceding it, which is either holding the lock or waiting ahead of it.
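The queueing and watch behavior described above can be sketched in a few lines. The
LockQueue class below is a hypothetical in-memory stand-in for a lock node with
ordered children, not a ZooKeeper client API; its method names are assumptions made
for illustration. It shows that each waiter watches only its immediate predecessor,
so a release wakes one process instead of the whole queue.

```python
class LockQueue:
    """In-memory stand-in for a lock node with sequentially numbered children."""

    def __init__(self):
        self._next_seq = 0
        self._waiters = {}   # sequence number -> session name

    def request(self, session):
        """Enqueue a session and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._waiters[seq] = session
        return seq

    def holder(self):
        """The session with the lowest sequence number holds the lock."""
        if not self._waiters:
            return None
        return self._waiters[min(self._waiters)]

    def watch_target(self, seq):
        """Sequence number this waiter should watch, or None if it holds the lock."""
        earlier = [s for s in self._waiters if s < seq]
        return max(earlier) if earlier else None

    def release(self, seq):
        """Remove a node (lock released or session gone) and return the new holder."""
        self._waiters.pop(seq, None)
        return self.holder()


queue = LockQueue()
a = queue.request("A")
b = queue.request("B")
c = queue.request("C")
```

Here A holds the lock, B watches A, and C watches B. When A releases, only B needs to
be notified; C's watch is untouched until B in turn releases or disappears.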
Failure and recovery
A common issue in coordinating a large number of processes is connection loss. When a
connection is lost, the failover process needs information on the children affected by
the loss to complete the failover. To manage this, the client session ID is associated
with child zNodes and locks, which enables the failover client to synchronize.
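One way to picture the association between a session ID and the zNodes and locks it
owns is a simple lookup table. The SessionRegistry class below is purely an
illustrative sketch: the class, its methods, the example paths, and the session IDs
are all hypothetical, and ZooKeeper's own bookkeeping is not shown here. It only
demonstrates how a failover client could find and take over the children affected by
a lost connection once ownership is recorded per session.

```python
from collections import defaultdict


class SessionRegistry:
    """Hypothetical table mapping a client session ID to the paths it owns."""

    def __init__(self):
        self._by_session = defaultdict(set)

    def register(self, session_id, path):
        """Record that a child zNode or lock belongs to a session."""
        self._by_session[session_id].add(path)

    def affected_by(self, session_id):
        """Paths that are affected if this session's connection is lost."""
        return sorted(self._by_session.get(session_id, ()))

    def take_over(self, old_session, new_session):
        """Reassign everything owned by the lost session to the failover client."""
        paths = self._by_session.pop(old_session, set())
        self._by_session[new_session] |= paths
        return sorted(paths)


registry = SessionRegistry()
registry.register(0x1A, "/locks/job/lock-0000000001")   # example paths only
registry.register(0x1A, "/workers/w1")
registry.register(0x2B, "/workers/w2")
moved = registry.take_over(0x1A, 0x3C)   # session 0x3C is the failover client
```

After the takeover, the failover session owns both paths of the lost session, and the
unrelated session 0x2B is unaffected.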
ZooKeeper is a highly available system, and it is critical that it can perform its
functions in a timely manner. It is recommended to run ZooKeeper on dedicated
machines. Running it in a shared services environment will adversely impact performance.