
5

Pharmacy industry applications and usage





                                                   Torture the data, and it will confess to anything.
                                                   Ronald Coase, winner of the Nobel Prize in Economics

Pharmaceutical companies run extremely complex mathematical analytics across all of their processes. The interesting viewpoint here is that they operate in a world of data complexity, which needs to be understood across different layers with the appropriate blending of insights. Incorrectly applying formulas and calculations will only lead us to incorrect conclusions. Data is something you have to manipulate to get at key truths, and how you decide to treat your data can vastly affect the conclusions you draw.
In the use case analysis in this chapter, we will discuss the implementation of Hadoop by Novartis and their approach to overcoming the challenges posed in traditional data worlds by compute complexity and by concurrent multi-user access to the underlying data for other analytics and reporting. We will discuss the facets of big data applications, looking into accessing streaming data sets, data computation using in-memory architectures, distributed data processing for creating data lakes and analytical hubs, in-process visualization and decision support, the data science team, and the changes that need to happen for the successful creation of big data applications. We will also discuss the usage and compliance requirements for the data, along with the security, encryption, storage, compression, and retention topics specific to the pharmaceutical industry.
Complexity is a very sensitive subject in the world of data and analytics. By definition, it deals with processes that are interconnected and have dependencies that may be visible or hidden, often leading to chaos in the processing of the data. Such systems exhibit characteristics including, but not limited to, the following:

- The number of parts (and types of parts) in the system and the number of relations between them is nontrivial. There is no general rule that separates "trivial" from "nontrivial"; it is up to the owner of the data to define these rules and document them with flows and dependencies. This issue is essential to address, as large systems can grow complex and fall out of use, which is both a cost and a productivity loss.
- The system has memory or includes feedback loops, which need to be defined, and exit strategies need to be validated for each operation. Scientific compute falls into this category, and there are several case studies of experiments


Building Big Data Applications. https://doi.org/10.1016/B978-0-12-815746-6.00005-3
Copyright © 2020 Elsevier Inc. All rights reserved.