Page 115 - Building Big Data Applications
Chapter 5 Pharmacy industry applications and usage
This innovative use of a knowledge graph lets Novartis bioinformaticians easily model
the complex and changing ways that biological datasets connect to one another,
while the use of Spark allows them to perform graph manipulations reliably and at
scale.
On the analytics side, researchers can access data directly through a Spark API, or
through a number of endpoint databases with schemas tailored to their specific
analytic needs. Their tool chain allows entire schemas with 100 billion rows to be
created quickly from the knowledge graph and then imported into the analyst’s favorite
database technologies. As a result of their efforts, this flexible workflow tool is
now being used for a variety of different projects across Novartis, including video
analysis, proteomics, and metagenomics.
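To make the idea of deriving an analyst-facing schema from a knowledge graph concrete, here is a minimal sketch in plain Python. It is not Novartis' tool chain, which the text describes only as Spark-based; the triple format, the sample data, and the function name `project_to_table` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical knowledge-graph edges as (subject, predicate, object) triples.
# The identifiers and predicates are invented for this sketch.
TRIPLES = [
    ("sample:1", "type", "Sample"),
    ("sample:1", "organism", "Homo sapiens"),
    ("sample:1", "assay", "RNA-seq"),
    ("sample:2", "type", "Sample"),
    ("sample:2", "organism", "Mus musculus"),
    ("gene:BRCA1", "type", "Gene"),
    ("gene:BRCA1", "symbol", "BRCA1"),
]


def project_to_table(triples, entity_type, columns):
    """Flatten all entities of one type into rows with a fixed column set."""
    # Gather every property recorded for every subject in the graph.
    properties = defaultdict(dict)
    for subject, predicate, obj in triples:
        properties[subject][predicate] = obj

    rows = []
    for subject in sorted(properties):
        props = properties[subject]
        if props.get("type") != entity_type:
            continue
        # Properties absent from the graph become None, like NULL in a table.
        row = {"id": subject}
        for column in columns:
            row[column] = props.get(column)
        rows.append(row)
    return rows


sample_rows = project_to_table(TRIPLES, "Sample", ["organism", "assay"])
for row in sample_rows:
    print(row)
```

The same graph can be projected into a different table simply by choosing another entity type and column list, which is the flexibility the text attributes to schemas tailored to each analytic need. At the scale described here, the grouping step would run as a distributed job rather than in a single process.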
A wonderful side benefit is that the integration of data science infrastructure into
pipelines built partly from legacy bioinformatics tools can be achieved in mere days,
rather than months. By combining Spark and Hadoop-based workflow and integration
layers, Novartis’ life science researchers are able to take advantage of the tens of
thousands of experiments that public organizations have conducted, which gives
them a significant competitive advantage.
Additional reading
Predict malaria outbreaks (http://ijarcet.org/wp-content/uploads/IJARCET-VOL-4-ISSUE-12-4415-4419.pdf).
MIT Clinical Machine Learning Group is spearheading the development of next-generation intelligent
electronic health records (http://clinicalml.org/research.html).
MATLAB’s ML handwriting recognition technologies (https://www.mathworks.com/products/demos/machine-learning/handwriting_recognition/handwriting_recognition.ht).
Google’s Cloud Vision API (https://cloud.google.com/vision/) for optical character recognition.
Stat News (https://www.statnews.com/2016/10/03/machine-learning-medicine-health/).
Advanced predictive analytics in identifying candidates for clinical trials (http://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/how-big-data-can-revolutionize-pharmaceutical-r-and-d).
The UK’s Royal Society also discusses ML in bio-manufacturing for pharmaceuticals (http://blogs.royalsociety.org/in-verba/2016/10/05/machine-learning-in-the-pharmaceutical-industry/).
Microsoft’s Project Hanover (http://hanover.azurewebsites.net/) is using ML technologies in multiple
initiatives, including a collaboration with the Knight Cancer Institute (http://www.ohsu.edu/xd/health/services/cancer/)
to develop AI technology for cancer precision treatment.