Page 66 - Building Big Data Applications
compliance. Dynamo and Memcached inspired the database architecture. Data is stored
as key-value pairs and organized in a ring topology, with redundancy and range
management built into each node of the ring. The architecture solves a narrow set of
problems and hence did not see wide adoption outside of LinkedIn. It is still being
evolved and updated at the time of writing.
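The ring layout described above can be illustrated with a minimal sketch. This is a generic consistent-hashing ring, not the actual implementation of the store discussed here; the class name Ring, the node names, the MD5-based placement, and the replicas parameter are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy ring: each node owns the key range that ends at its position."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        # Sort nodes by ring position so lookups can use binary search.
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def nodes_for(self, key: str):
        """Return the owner of `key` plus the next nodes clockwise as replicas."""
        # First node at or after the key's position; wrap around past the end.
        start = bisect.bisect_left(self._positions, _hash(key)) % len(self._ring)
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


# Hypothetical four-node cluster; each key is stored on two nodes.
node_names = ["node-a", "node-b", "node-c", "node-d"]
ring = Ring(node_names, replicas=2)
owners = ring.nodes_for("member:1001")
print(owners)
```

Because every node can compute the same placement from the key alone, range management and redundancy need no central coordinator in this sketch.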
Cassandra
In its initial years, Facebook used a leading commercial database solution for its
internal architecture, in conjunction with some Hadoop. Eventually the tsunami of users
led the company to start thinking in terms of unlimited scalability, with a focus on
availability and distribution. The nature of the data and its producers and consumers
did not mandate strict consistency but did need high availability and scalable
performance. The team at Facebook built an architecture, named Cassandra, that combines
the data model approach of Bigtable with the infrastructure approach of Dynamo to
deliver scalability and performance. It is often referred to as a hybrid architecture
because it combines the column-oriented data model from Bigtable with support for
Hadoop MapReduce jobs, and it implements patterns from Dynamo such as eventual
consistency, gossip protocols, and a master-master way of serving both read and write
requests. Cassandra supports a full replication model based on NoSQL architectures.
The Cassandra team had a few design goals to meet, given that at the time of its
initial development the architecture was being deployed primarily at Facebook.
The goals included the following:
High availability
Eventual consistency
Incremental scalability
Optimistic replication
Tunable tradeoffs between consistency, durability, and latency
Low cost of ownership
Minimal administration
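The goal of tunable tradeoffs between consistency and latency can be made concrete with the quorum rule commonly cited for Dynamo-style stores: with N replicas, a read that contacts R replicas is guaranteed to see the latest acknowledged write to W replicas when R + W > N. The text above does not state this rule, so treat the sketch as an assumption-based illustration; the function name quorums_overlap and the sample settings are hypothetical.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read set of size r intersects every write set of size w."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


# Three replicas per key, chosen only as an illustrative setting.
N = 3
settings = {
    "fast, eventually consistent": (1, 1),
    "quorum reads and writes": (2, 2),
    "cheap reads, expensive writes": (1, 3),
}
results = {name: quorums_overlap(N, r, w) for name, (r, w) in settings.items()}
for name, overlap in results.items():
    print(f"{name}: read and write sets overlap = {overlap}")
```

Smaller R and W lower latency because fewer replicas must respond, at the price of reads that may briefly return stale data.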
Data model
The Cassandra data model is based on a key-value model, in which a key uniquely
identifies a value, and this value can be structured, completely unstructured, or itself
a collection of other key-value elements. This is similar to pointers and linked
lists in the world of programming. Fig. 2.22 shows the basic key-value structure.
FIGURE 2.22 Key-value pair.
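As a rough illustration of the key-value idea, the sketch below uses a plain Python dictionary in place of a real store. It is not Cassandra's API; the keys, the sample values, and the lookup helper are hypothetical and only show how a value may be opaque, structured, or a nested collection of further key-value pairs.

```python
from typing import Dict, Union

# A value may be opaque text or bytes, or a nested collection of key-value pairs.
Value = Union[str, bytes, dict]

store: Dict[str, Value] = {}

# Unstructured value: the store does not interpret the bytes.
store["avatar:1001"] = b"raw image bytes"

# Structured value: itself a collection of key-value elements.
store["user:1001"] = {
    "name": "Ada",
    "contact": {"email": "ada@example.com", "city": "London"},
}


def lookup(store, key, *path):
    """Fetch the value for `key`, then follow `path` through nested pairs."""
    value = store[key]
    for part in path:
        value = value[part]
    return value


email = lookup(store, "user:1001", "contact", "email")
print(email)
```

Following a path through nested values is what makes the comparison with pointers and linked lists apt: each key leads to a value that may in turn hold further keys.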