Page 67 - Building Big Data Applications
Chapter 2 Infrastructure and technology 61
FIGURE 2.23 Cassandra key-value pair (column).
A key-value pair can represent simple storage for Person/Name type of data, but it
cannot scale much. The basic model is altered so that the value itself holds a name and a
value; this provides a structure in which multiple values can be created and a key
associated with each name-value pair. The result is a table-like structure, described in Fig. 2.23.
In the updated structure of the key-value notation, we can store Person/Name/
John Doe, add another column called Person/Age/30, and create multiple storage
structures. This defines the most basic structure in the Cassandra data model, called a
“column”.
Column: an ordered list of values stored as a name-value pair. It is composed of a
column name, a column value, and a third element called the timestamp. The timestamp is
used for conflict resolution on the server when there are conflicting values
or columns to be managed. The client sets the timestamp, which is stored along with the
data, and this is an explicit operation.
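The three-part column and its timestamp-based conflict resolution can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the `Column` class and the `resolve` function are hypothetical names, and the rule that the version with the later timestamp wins is stated here as an assumption about how the timestamp is applied. Tie handling is deliberately simplified.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    """Hypothetical model of a column: name, value, and a client-set timestamp."""
    name: str
    value: Any
    timestamp: int  # supplied explicitly by the client, e.g. microseconds since epoch


def resolve(a: Column, b: Column) -> Column:
    """Pick between two versions of the same column.

    Assumption: the version with the later timestamp wins.
    On a tie this sketch simply returns the second argument.
    """
    if a.name != b.name:
        raise ValueError("can only resolve two versions of the same column")
    return a if a.timestamp > b.timestamp else b


# Two conflicting writes to Person/Name (illustrative values only).
older = Column("Name", "John Doe", timestamp=1_000)
newer = Column("Name", "John A. Doe", timestamp=2_000)
winner = resolve(older, newer)
print(winner.value)
```

Because the client supplies the timestamp, the outcome does not depend on the order in which the two writes reach the server, only on the timestamps they carry.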
The column can hold any type of data in this model, from characters to GUIDs
to blobs. Columns can be grouped into a row, which is identified by a rowkey. A simple column by itself
limits the values you can represent; to add more flexibility, a group of columns belonging
to a key can be stored together as a column family. A column family can be loosely
compared to a table in a relational database.
Column Family: a logical and physical grouping of a set of columns that can be
represented by a single key. The flexibility of a column family is that the names of columns
can vary from one row to another, and the number of columns can vary over
time (Fig. 2.24).
There is no limitation on creating different column structures in a column family,
except that maintaining them is the responsibility of the application that creates the
different structures. Conceptually, this is similar to overloading in an object-oriented
programming language.
RowKey | Name  | Name  | Name  | …
       | Value | Value | Value | …
FIGURE 2.24 A column family representation.
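As a rough sketch of the column family idea, the structure can be modeled as a mapping from row key to a set of named columns, where each row is free to carry different column names. All names here (`person`, `put`, `get_row`, the row keys, and the sample values) are hypothetical and chosen only for illustration; this is not how Cassandra stores data internally, and the later-timestamp-wins rule in `put` is an assumption.

```python
from collections import defaultdict

# Column family: row key -> {column name: (value, timestamp)}
person = defaultdict(dict)


def put(family, row_key, name, value, timestamp):
    """Write a column into a row, keeping the version with the later timestamp."""
    current = family[row_key].get(name)
    if current is None or timestamp >= current[1]:
        family[row_key][name] = (value, timestamp)


def get_row(family, row_key):
    """Return the row's columns as {name: value}, ordered by column name."""
    columns = family.get(row_key, {})
    return {name: value for name, (value, _) in sorted(columns.items())}


# Rows in the same column family do not need the same columns.
put(person, "person1", "Name", "John Doe", timestamp=1)
put(person, "person1", "Age", 30, timestamp=1)
put(person, "person2", "Name", "Jane Roe", timestamp=1)
put(person, "person2", "Email", "jane@example.com", timestamp=2)

print(get_row(person, "person1"))
print(get_row(person, "person2"))
```

Nothing in the structure enforces a common set of columns across rows, so keeping the differing row layouts consistent is left to the application that writes them.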