Page 60 - Big Data Analytics for Intelligent Healthcare Management

52      CHAPTER 3 BIG DATA ANALYTICS IN HEALTHCARE: A CRITICAL ANALYSIS





             [Figure, flow from left to right: external and internal healthcare data form the raw
             healthcare data, which is transformed into a recognizable big data format and passed to
             different big data platforms and analytics tools (e.g., Hadoop, MapReduce, Pig and PigLatin,
             Hive, etc.). The resulting big data analytics yields generated reports, queries, OLAP for
             business intelligence, and data mining for knowledge discovery.]

             FIG. 3.5
             Level 1 architectural framework for big data analytics in healthcare data.



             formats such as ASCII/text, flat files, .csv, relational tables, etc. [35]. Medical big data
             analytics clearly differs from traditional big data analytics, mainly because raw medical big
             data varies in nature and comes from heterogeneous sources. An extra preprocessing step is
             therefore needed for data cleansing, data normalization, and data transformation. For medical
             image big data, special enhancement techniques (preferably fuzzy-based ones) should be
             adopted to improve the images and hence support a better medical diagnosis.
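
             The preprocessing step described above (cleansing, normalization, transformation) can be
             illustrated with a minimal Python sketch. It is not taken from the chapter: the field names
             (patient_id, heart_rate, systolic_bp), the sample records, the choice of min-max scaling,
             and the JSON-lines output format are all assumptions made for illustration only.

```python
import csv
import io
import json

# Hypothetical raw extract: field names and values are illustrative only.
RAW_CSV = """patient_id,heart_rate,systolic_bp
p001, 72 ,120
p002,,135
p003,88,n/a
p004,110,160
"""

FIELDS = ("patient_id", "heart_rate", "systolic_bp")
NUMERIC_FIELDS = ("heart_rate", "systolic_bp")


def cleanse(rows):
    """Data cleansing: trim whitespace, drop rows with missing or non-numeric readings."""
    clean = []
    for raw in rows:
        row = {key: (raw.get(key) or "").strip() for key in FIELDS}
        try:
            for field in NUMERIC_FIELDS:
                row[field] = float(row[field])
        except ValueError:
            continue  # skip incomplete or malformed records
        clean.append(row)
    return clean


def normalize(rows):
    """Data normalization: min-max scale each numeric field to [0, 1]."""
    if not rows:
        return rows
    for field in NUMERIC_FIELDS:
        values = [row[field] for row in rows]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid division by zero for constant columns
        for row in rows:
            row[field] = (row[field] - low) / span
    return rows


def transform(rows):
    """Data transformation: emit one JSON record per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def preprocess(raw_text):
    """Run the three steps in order on CSV text."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    return transform(normalize(cleanse(rows)))


if __name__ == "__main__":
    print(preprocess(RAW_CSV))
```

             In this sketch the two records with a missing or non-numeric reading are dropped during
             cleansing. A real pipeline might impute such values instead, and medical image data
             would need the separate enhancement step mentioned above.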




             3.6.2 DIFFERENT TOOLS USED IN BIG DATA ANALYTICS IN HEALTHCARE DATA
             A number of different platforms and tools are used for big data analytics on healthcare data.
             The most frequently used ones [35] are briefly introduced below:
             (a) The Hadoop Distributed File System (HDFS):
                    This is the primary data storage system used by Hadoop applications. It uses a NameNode and
                 DataNode architecture to implement a distributed file system (DFS) that provides
                 high-performance access to data across highly scalable Hadoop clusters. It can rapidly
                 transfer data between different computer nodes. Its main drawback is that it is complex
                 to use. It is also not well suited to small data.
             (b) MapReduce:
                    MapReduce is the central part of the Apache Hadoop software framework. It plays two vital
                 roles: mapper and reducer. In the first role (map), it filters the data and parcels out work
                 to various nodes within the cluster; in the second role (reduce), it organizes and reduces
                 the results from each node into a consistent answer to a query. When tasks are executed, it tracks the processing of