Page 19 - Building Big Data Applications

Chapter 1   Big Data introduction  13


                 1. Acquire data from all sources. These sources include automobiles, devices,
                   machines, mobile devices, networks, sensors, wearable devices, and anything that
                   produces data.
                 2. Ingest all the acquired data into a data swamp. The key to the ingestion process
                   is to tag the source of the data. Streaming data can be processed as a stream
                   and can also be saved as files. Ingestion also includes sensor and machine data.
                 3. Discover data and perform initial analysis. This process requires tagging and
                   classifying the data based on its source, attributes, significance, and need for
                   analytics and visualization.
                 4. Create a data lake after data discovery is complete. This process involves
                   extracting the data from the swamp, enriching it with metadata, semantic data, and
                   taxonomy, and improving its quality as far as is feasible. This data is then ready
                   to be used for operational analytics.
                 5. Create data hubs for analytics. This step can enrich the data with master data and
                   other reference data, creating an ecosystem to integrate this data into the database,
                   enterprise data warehouse, and analytical systems. The data at this stage is ready
                   for deep analytics and visualization.
                   The key point here is that steps 3, 4, and 5 all help create data lineage, data
                 readiness through enrichment at each stage, and a data availability index for usage.
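
                 The five steps above can be sketched in code. What follows is a minimal
                 illustration, not a reference implementation: the Record class, the function
                 names, the sample source "thermostat-7", and the enrichment values are all
                 hypothetical and chosen only to show how a source tag applied at ingestion and
                 a lineage trail might be carried through discovery, the lake, and the hub.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Record:
    """One piece of data moving through the stages (hypothetical structure)."""
    payload: dict[str, Any]
    source: str                                   # tagged at ingestion (step 2)
    tags: dict[str, str] = field(default_factory=dict)
    lineage: list[str] = field(default_factory=list)


def ingest(payload: dict[str, Any], source: str) -> Record:
    """Step 2: land raw data in the swamp, tagged with its source."""
    return Record(payload=payload, source=source, lineage=["ingested"])


def discover(record: Record) -> Record:
    """Step 3: classify the record by its attributes (toy rule, an assumption)."""
    record.tags["category"] = "sensor" if "reading" in record.payload else "other"
    record.lineage.append("discovered")
    return record


def to_lake(record: Record, metadata: dict[str, str]) -> Record:
    """Step 4: enrich with metadata on the way into the data lake."""
    record.tags.update(metadata)
    record.lineage.append("lake")
    return record


def to_hub(record: Record, master_data: dict[str, dict[str, str]]) -> Record:
    """Step 5: enrich with master/reference data looked up by source."""
    record.tags.update(master_data.get(record.source, {}))
    record.lineage.append("hub")
    return record


# Step 1 (acquisition) is represented here by a single made-up sensor reading.
raw = {"reading": 21.5, "unit": "C"}
rec = ingest(raw, source="thermostat-7")
rec = discover(rec)
rec = to_lake(rec, {"unit_system": "metric"})
rec = to_hub(rec, {"thermostat-7": {"site": "plant-A"}})

print(rec.lineage)  # ['ingested', 'discovered', 'lake', 'hub']
print(rec.tags)     # {'category': 'sensor', 'unit_system': 'metric', 'site': 'plant-A'}
```

                 Because each stage appends to the lineage list, the record itself shows how far
                 it has progressed, which is one simple way to think about readiness at each stage.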

                 Critical factors for success

                 Although the steps for processing data are similar to what we do in the world of Big
                 Data, the data here can be big, small, wide, fat, or thin, and all of it can be
                 ingested and qualified for usage. Several critical success factors emerge from this
                 journey:
                   Data: You need to acquire, ingest, collect, discover, analyze, and implement
                   analytics on the data. This data needs to be defined and governed across the
                   process. You also need to be able to handle greater volume, velocity, and variety,
                   as well as problems with data formats, availability, and ambiguity.
                   Business Goals: The most critical success factor is defining business goals. Without
                   the right goals, neither the data nor the analytics and outcomes derived from it
                   are useful.
                   Sponsors: Executive sponsorship is needed for the new age of innovation to be
                   successful. Without sponsorship, the analytical outcomes, the lineage and linking
                   of data, and the associated dashboards will not happen and will remain a pipe
                   dream.
                   Subject Matter Experts: The people and teams who are experts in the subject matter
                   need to be involved in the Internet of Things journey; they are key to the success
                   of the data analytics and to the use of that analysis.