Page 87 - Building Big Data Applications
Lab research happens 24x7x365
There are several teams of researchers that work on a program
They all conduct experiments constantly
They record all the steps that are taken in the process, the formulas, the outcomes at each stage, the models, the fitting of outcomes to the models, any errors and how they are handled
All the data is vital to understand when an error occurs and how to improve the efficiency to prevent the reoccurrence of the error
The core requirement of a platform here is the ability to record and replay the experiment as many times as needed.
The replay will provide the immense insights that are needed.
A new form of integration and reuse of information with details at each layer and its outcomes will deliver big benefits
FIGURE 3.7 Research.
of team members who will work at different times and execute different experiments. We
need to ensure that all experiments are recorded as they progress, along with their
formulas, calculations, outcomes, errors, and any failures.
The experiment steps, once recorded, can be replayed as many times as needed, so that
we can identify risks and issues as they occur. This experiment use case will provide
more value as we better understand the complexity of the process.
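The record-and-replay requirement can be illustrated with a small sketch. This is only one possible shape under stated assumptions, not a description of a specific platform: the names StepRecord, ExperimentLog, and dilute, and all field names, are hypothetical, and a real platform would persist the log rather than hold it in memory.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class StepRecord:
    """One recorded experiment step (all field names are illustrative)."""
    team: str
    step: str
    formula: str                 # formula kept as text for the audit trail
    inputs: Dict[str, float]
    outcome: Optional[float]     # None when the step failed
    error: Optional[str]         # error description when the step failed


class ExperimentLog:
    """Append-only, in-memory log: steps are recorded in order."""

    def __init__(self) -> None:
        self._records: List[StepRecord] = []

    def record(self, team: str, step: str, formula: str,
               func: Callable[..., float],
               inputs: Dict[str, float]) -> StepRecord:
        # Run the step and capture either its outcome or its error.
        try:
            outcome, error = func(**inputs), None
        except Exception as exc:
            outcome, error = None, f"{type(exc).__name__}: {exc}"
        rec = StepRecord(team, step, formula, dict(inputs), outcome, error)
        self._records.append(rec)
        return rec

    def replay(self, funcs: Dict[str, Callable[..., float]]) -> List[str]:
        """Re-run every step with its recorded inputs.

        Returns the names of steps that failed originally, fail on
        replay, or produce an outcome different from the recorded one.
        """
        flagged: List[str] = []
        for rec in self._records:
            try:
                outcome = funcs[rec.step](**rec.inputs)
            except Exception:
                flagged.append(rec.step)
                continue
            if rec.error is not None or outcome != rec.outcome:
                flagged.append(rec.step)
        return flagged


def dilute(concentration: float, factor: float) -> float:
    """Hypothetical experiment step used only for this example."""
    return concentration / factor


log = ExperimentLog()
ok = log.record("team-a", "dilute", "c / f", dilute,
                {"concentration": 8.0, "factor": 2.0})
bad = log.record("team-b", "dilute_zero", "c / f", dilute,
                 {"concentration": 8.0, "factor": 0.0})
flagged = log.replay({"dilute": dilute, "dilute_zero": dilute})
```

Here the second step divides by zero, so it is stored with an error instead of an outcome, and the replay flags it for the team to investigate. Keeping the inputs and the formula alongside each outcome is what makes the replay possible at any later time.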
The process steps, when recreated by integrating everybody’s results, will also
provide an opportunity to ensure that all risks are mitigated and that the final
execution run can be planned and tested.
The core focus that we have understood is the need to define how the information layer
is managed and processed. This aspect is different in the new world of information, as
we start moving into different aspects of data management, including DevOps and agile
project management, which includes the Kanban and Scrum/XP methodologies of code
development. In this new world, information floats around us in a continuum, and the
layers of data within this information can range from simple to extremely complex. What
do we do to be successful? The first order of business is to separate the noise from the
value in the information, which leads us to deliver the analytics and performance
indicators that the business requires in order to work with data. This means there are
two goals to accomplish from a governance perspective: first, the business needs to own
the program, and second, the business needs to drive the program. The business
subject matter experts will need to line up the rules and models of how the information
layer will be accessed and used, who can access the raw layers of data, and who can
operate the data for consumption by the larger team. The aspects of governance
especially need to be succinct and clear, as the noise-versus-value ratio is very
complex to manage.
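As a minimal sketch of what such governance rules might look like when written down, the example below encodes who may read or operate which layer. The role names, the two layers, and the POLICY table are assumptions made for illustration; in practice the business subject matter experts would define them.

```python
from enum import Enum


class Layer(Enum):
    RAW = "raw"                  # unprocessed data as it arrives
    CONSUMPTION = "consumption"  # data prepared for the larger team


class Action(Enum):
    READ = "read"
    OPERATE = "operate"          # transform or prepare data in a layer


# Hypothetical policy owned by the business: role -> allowed (layer, action).
POLICY = {
    "data_steward": {
        (Layer.RAW, Action.READ),
        (Layer.RAW, Action.OPERATE),
        (Layer.CONSUMPTION, Action.READ),
    },
    "analyst": {
        (Layer.CONSUMPTION, Action.READ),
    },
}


def is_allowed(role: str, layer: Layer, action: Action) -> bool:
    """Deny by default: unknown roles and unlisted pairs are refused."""
    return (layer, action) in POLICY.get(role, set())
```

With this table, an analyst can read the consumption layer but cannot touch the raw layer, and a role that is not listed gets no access at all. Keeping the rules in one short, explicit table is one way to keep the governance succinct and clear.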