Page 115 - Building Big Data Applications

Chapter 5   Pharmacy industry applications and usage  111


                   This innovative use of a knowledge graph lets Novartis bioinformaticians easily model
                   the complex and changing ways that biological datasets connect to one another,
                   while the use of Spark allows them to perform graph manipulations reliably and at
                   scale.
                   On the analytics side, researchers can access data directly through a Spark API, or
                   through a number of endpoint databases with schemas tailored to their specific
                   analytic needs. Their tool chain allows entire schemas with 100 billion rows to be
                   created quickly from the knowledge graph and then imported into the analyst’s
                   favorite database technologies. As a result of their efforts, this flexible workflow tool is
                   now being used for a variety of different projects across Novartis, including video
                   analysis, proteomics, and metagenomics.
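
The idea of generating an endpoint schema from a knowledge graph can be illustrated with a minimal sketch. It is not Novartis' implementation: it assumes the graph is stored as (subject, predicate, object) triples, uses plain Python with an in-memory SQLite database in place of Spark and a production database, and every entity, predicate, table, and function name below is hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical knowledge-graph triples (subject, predicate, object).
# The entities and predicates are illustrative only.
TRIPLES = [
    ("gene:BRCA1", "symbol", "BRCA1"),
    ("gene:BRCA1", "organism", "human"),
    ("gene:TP53", "symbol", "TP53"),
    ("gene:TP53", "organism", "human"),
    ("gene:TP53", "chromosome", "17"),
]


def triples_to_rows(triples, columns):
    """Pivot triples into one row per subject, one field per requested predicate."""
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        if predicate in columns:
            by_subject[subject][predicate] = obj
    # A predicate missing for a subject becomes None (NULL in the database).
    return [
        (subject, *(fields.get(col) for col in columns))
        for subject, fields in sorted(by_subject.items())
    ]


def load_endpoint_table(conn, table, columns, rows):
    """Create a table whose schema is tailored to the chosen predicates, then load it."""
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" (id TEXT PRIMARY KEY, {col_defs})')
    placeholders = ", ".join("?" for _ in range(len(columns) + 1))
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)


def build_gene_table():
    """Build an in-memory endpoint database holding a 'genes' table."""
    columns = ["symbol", "organism", "chromosome"]
    conn = sqlite3.connect(":memory:")
    load_endpoint_table(conn, "genes", columns, triples_to_rows(TRIPLES, columns))
    return conn
```

Because the list of columns is just a parameter, a different analysis could request a different set of predicates and get a differently shaped table from the same triples, which is the flexibility the paragraph above describes.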
                   A wonderful side benefit is that the integration of data science infrastructure into
                   pipelines built partly from legacy bioinformatics tools can be achieved in mere days,
                   rather than months. By combining Spark and Hadoop-based workflow and integration
                   layers, Novartis’ life science researchers are able to take advantage of the tens of
                   thousands of experiments that public organizations have conducted, which gives
                   them a significant competitive advantage.


                 Additional reading

                 Predict malaria outbreaks (http://ijarcet.org/wp-content/uploads/IJARCET-VOL-4-ISSUE-12-4415-4419.pdf).
                 MIT Clinical Machine Learning Group is spearheading the development of next-generation intelligent
                   electronic health records (http://clinicalml.org/research.html).
                 MATLAB’s ML handwriting recognition (https://www.mathworks.com/products/demos/machine-learning/handwriting_recognition/handwriting_recognition.ht technologies).
                 Google’s Cloud Vision API (https://cloud.google.com/vision/) for optical character recognition.
                 Stat News (https://www.statnews.com/2016/10/03/machine-learning-medicine-health/).
                 Advanced predictive analytics in identifying candidates for clinical trials
                   (http://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-big-data-can-revolutionize-pharmaceutical-r-and-d).
                 The UK’s Royal Society also notes the use of ML in bio-manufacturing for pharmaceuticals
                   (http://blogs.royalsociety.org/in-verba/2016/10/05/machine-learning-in-the-pharmaceutical-industry/).
                 Microsoft’s Project Hanover (http://hanover.azurewebsites.net/) is using ML technologies in multiple
                   initiatives, including a collaboration with the Knight Cancer Institute
                   (http://www.ohsu.edu/xd/health/services/cancer/) to develop AI technology for cancer precision treatment.