worry taking the risk, and our drivers and passengers for the AI segments are bolder and
more willing to experiment with the new toolkits.
The potential of artificial intelligence is obvious. Adobe Inc. has estimated that 31% of
all organizations will start using AI. This statistic is backed up by the fact that more
startups than ever are focusing their operations on AI and the services it provides to the
masses. Roughly 61% of organizations that follow innovative strategies have turned to AI
to extract insights from their data that they might previously have missed. Innovation is
a hallmark of artificial intelligence, and startups built on this ideology can seldom live
without the oxygen that is AI.
Not only are marketers confident about AI, but consumers are also starting to grasp
its vast potential. About 38% of consumers believe that AI will improve customer
service. With this growing awareness and popularity, we can expect these numbers to
increase further down the line.
Challenges before you start with AI
Organizations are finding it hard to gain their footing under the 4 V's of big data: volume,
variety, veracity, and velocity. Over 38% of analytics and data decision-makers in the
market reported that their pools of unstructured, semistructured, and structured data
grew by 1000 TB in 2017. Data is growing rapidly, as are the initiatives organizations are
taking to extract value from it. Herein lie numerous challenges that organizations must
overcome to extract full value from AI.
These complications are as follows:
Getting Quality Data
Your inference tools will only be as good as the data you have with you. If the
data you are feeding your machines is not structured and clean, the inferences
drawn from it will barely make the cut for your organization. Thus, the first step
of the process is to have quality data.
Without trust in the quality of the data, stakeholders will not proceed with their
AI initiative. This demonstrates the importance of data quality in AI, and how it
changes the perspective of the stakeholders involved.
The Pareto principle applies here: data scientists are bound to spend almost
80% of their time making data ready for analysis and the remaining 20%
performing analysis on the prepared data. The creation of these data sets for
the ultimate analysis is key to the overall success of the program, which is why
data scientists have to allot so much of their time to it.
The 80/20 phenomenon has been noted by many analysts online, who believe
that 80% of a data scientist’s valuable time is spent finding, reorganizing, and
cleaning up huge amounts of data.
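To make that 80/20 split concrete, the sketch below shows a few of the routine
preparation steps that typically consume the 80%: deduplicating records,
normalizing types, handling missing values, and standardizing text. It is a
minimal illustration using pandas on a hypothetical rides.csv export; the file
name, column names, and cleaning rules are assumptions for demonstration, not a
prescribed pipeline.

    import pandas as pd

    # Hypothetical raw export; file and columns are illustrative assumptions.
    raw = pd.read_csv("rides.csv")

    # 1. Drop exact duplicate records, a common artifact of repeated ingestion.
    clean = raw.drop_duplicates()

    # 2. Normalize types: timestamps often arrive as strings, fares as mixed
    #    text and numbers; coerce unparseable entries to missing values.
    clean["pickup_time"] = pd.to_datetime(clean["pickup_time"], errors="coerce")
    clean["fare"] = pd.to_numeric(clean["fare"], errors="coerce")

    # 3. Handle missing values: rows without a pickup time are unusable here,
    #    while missing fares are imputed with the median as a simple placeholder.
    clean = clean.dropna(subset=["pickup_time"])
    clean["fare"] = clean["fare"].fillna(clean["fare"].median())

    # 4. Standardize categorical text so "NYC", "nyc ", and "Nyc" do not
    #    fragment later group-bys.
    clean["city"] = clean["city"].str.strip().str.lower()

    # Only now does the remaining 20%, the actual analysis, begin.
    print(clean.describe())

Each step is mechanical on its own; the time cost the analysts describe comes
from discovering which of these fixes a given data set needs and repeating them
across many sources.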