Page 15 - Building Big Data Applications
Big Data infrastructure which will be leveraged with applications. Two platforms
have been created for this purpose: Hadoop and NoSQL.
Hadoop: The platform originated in the world of the Internet, with Yahoo building on
Apache Nutch to implement a platform that can perform infinite crawls of the web
and provide search results. This capability came with four basic design goals that
were defined for Hadoop:
System shall manage and heal itself
Performance shall scale linearly
Compute shall move to data
Simple core, modular, and extensible
These goals were needed for the Internet because users do not have the patience to wait
more than a few milliseconds and often move on if they do not get answers.
The biggest benefit of these goals is the availability of the platform 24x7x365, with
data available as soon as it is created and acquired into the platform. Today all
the vendors have started adopting a Hadoop-driven interface, are moving from an
on-premise to a cloud model, and have integrated with in-memory processing and HDFS.
We will see in upcoming chapters the details of the stack and how it has helped in
multiple implementations.
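The goal "compute shall move to data" can be illustrated with a minimal sketch. This is not Hadoop code. The node names, the `partitions` dictionary, and the helper functions are hypothetical, and a local function call stands in for shipping a task to a remote node. The point is only that each node processes its own partition and returns a small summary, so adding nodes adds capacity without moving the raw data.

```python
from collections import Counter
from functools import reduce

# Hypothetical cluster: each "node" holds its own partition of the data.
partitions = {
    "node-1": ["big data", "hadoop scales"],
    "node-2": ["data moves", "compute moves to data"],
}

def count_words(lines):
    """Task that runs where the data lives and returns only a small summary."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def run_on_each_node(task, partitions):
    # Assumption: in a real cluster the task would be sent to each node.
    # Here a local call per partition stands in for that step.
    return [task(lines) for lines in partitions.values()]

def word_count(partitions):
    # Only the per-node summaries are combined, never the raw lines.
    partials = run_on_each_node(count_words, partitions)
    return reduce(lambda a, b: a + b, partials, Counter())

total = word_count(partitions)
```

Because each partition is processed independently, the same `word_count` call works unchanged if more nodes are added to `partitions`.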
Not-only-SQL (NoSQL), as we know it, evolved into the web database platform that
was designed to move away from the ACID-compliant database and create a
replication-based model to ingest and replicate data based on system requirements.
We have seen the evolution of Cassandra, MongoDB, HBase, Amazon Dynamo, Apache
Giraph, and MarkLogic. These NoSQL databases have all delivered solutions that have
created analytics and insights like never before. These databases have been accepted
into the enterprise but have yet to gain wide adoption. We will discuss these
databases and their implementations in the following chapters.
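A replication-based model of this kind can be sketched in a few lines. The example below is a toy in-memory illustration, not the design of any of the databases named above. The `ReplicatedStore` class, its method names, and the `sync_replicas` parameter are all assumptions made for this sketch. A write is applied to some replicas immediately and to the rest later, so a read from a lagging replica can briefly return stale data, which is the trade-off made when moving away from strict ACID guarantees.

```python
class ReplicatedStore:
    """Toy key-value store with several replicas (hypothetical design)."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # (replica_index, key, value) not yet applied

    def put(self, key, value, sync_replicas=1):
        # Apply the write to the first `sync_replicas` replicas right away
        # and queue it for the remaining ones.
        for i, replica in enumerate(self.replicas):
            if i < sync_replicas:
                replica[key] = value
            else:
                self.pending.append((i, key, value))

    def get(self, key, replica_index=0):
        # Read from one chosen replica; it may not have the latest write yet.
        return self.replicas[replica_index].get(key)

    def replicate(self):
        # Apply all queued writes, bringing every replica up to date.
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()

store = ReplicatedStore()
store.put("user:1", "Ada")
stale = store.get("user:1", replica_index=2)  # replica 2 has not caught up
store.replicate()
fresh = store.get("user:1", replica_index=2)  # replica 2 now has the write
```

Raising `sync_replicas` makes reads more consistent at the cost of slower writes. Real systems expose this choice in their own ways.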
Building Big Data applications
The Internet of Things is evolving rapidly and growing at a fascinating pace, bringing
increasing opportunities to innovate continuously, with capabilities to play and replay
events as they occur and observe the effects as each event unfolds. Today we are equipped
with the technology layers needed to make this paradigm shift and to let an entire team of
people, whether in one location or across the globe, collaborate and understand the power
of data. The paradigm shift did not occur easily and took time to mature, but since it
became reality the world has passed through the tipping point multiple times.
The 10 commandments for building Big Data applications:
1. Data is the new paradigm shift. We need to understand that the world revolves
around actions and counteractions from human beings and the systems they are connected
to. All of these actions produce data, which if harnessed and aligned will