
Q9-5  How Do Organizations Use Data Mining Applications?

                                               Supervised Data Mining
                                               With supervised data mining, data miners develop a model prior to the analysis and apply
                                               statistical techniques to data to estimate parameters of the model. For example, suppose
                                               marketing experts in a communications company believe that cell phone usage on weekends is
                                               determined by the age of the customer and the number of months the customer has had the cell
                                               phone account. A data mining analyst would then run an analysis that estimates the effect of
                                               customer age and account age on weekend usage.
                                                   One such analysis, which measures the effect of a set of variables on another variable, is
                                               called a regression analysis. A sample result for the cell phone example is:
                                                               CellphoneWeekendMinutes = 12 + (17.5 * CustomerAge)
                                                                                            + (23.7 * NumberMonthsOfAccount)
                                                   Using this equation, analysts can predict the number of minutes of weekend cell phone use
                                               by summing 12, plus 17.5 times the customer’s age, plus 23.7 times the number of months of the
                                               account.
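
The arithmetic described above can be sketched in a few lines of Python. The function name and the example customer (30 years old, account open for 24 months) are assumptions made for illustration; only the three coefficients come from the sample equation.

```python
def predict_weekend_minutes(customer_age: float, months_of_account: float) -> float:
    """Apply the sample regression equation from the text."""
    return 12 + (17.5 * customer_age) + (23.7 * months_of_account)


# Hypothetical customer, not taken from the text.
example_minutes = predict_weekend_minutes(customer_age=30, months_of_account=24)
print(round(example_minutes, 1))
```

For this made-up customer the sum is 12 + 525 + 568.8, or roughly 1,106 predicted weekend minutes.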
                                                   As you will learn in your statistics classes, considerable skill is required to interpret the quality
                                               of such a model. The regression tool will create an equation, such as the one shown. Whether that
                                               equation is a good predictor of future cell phone usage depends on statistical factors, such as
                                               t values, confidence intervals, and related statistical techniques.
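
To give a sense of what "estimating the parameters" means, here is a minimal sketch of ordinary least squares for a model with two input variables. Everything in it is an assumption made for illustration: the function name, the six customers, and the use of noise-free usage values generated from the sample equation. Real data would be noisy, and a regression tool would also report the t values and confidence intervals mentioned above, which this sketch does not compute.

```python
def fit_two_variable_regression(rows):
    """Estimate b0, b1, b2 in y = b0 + b1*x1 + b2*x2 by least squares.

    rows is a list of (x1, x2, y) tuples. The function solves the normal
    equations (X^T X) b = X^T y with Gauss-Jordan elimination. It assumes
    the inputs are not collinear; otherwise a pivot would be zero.
    """
    # Build the 3x4 augmented matrix [X^T X | X^T y].
    aug = [[0.0] * 4 for _ in range(3)]
    for x1, x2, y in rows:
        features = (1.0, float(x1), float(x2))
        for i in range(3):
            for j in range(3):
                aug[i][j] += features[i] * features[j]
            aug[i][3] += features[i] * y

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot_row = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [value / pivot_value for value in aug[col]]
        for r in range(3):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]

    return [aug[i][3] for i in range(3)]


# Made-up customers: (age in years, months the account has been open).
ages_and_months = [(20, 6), (25, 12), (30, 24), (40, 3), (55, 36), (35, 18)]

# Noise-free usage generated from the sample equation, for illustration only.
rows = [(age, months, 12 + 17.5 * age + 23.7 * months)
        for age, months in ages_and_months]

intercept, age_effect, months_effect = fit_two_variable_regression(rows)
print(round(intercept, 3), round(age_effect, 3), round(months_effect, 3))
```

Because the data here were generated from the equation itself, the fit recovers the original coefficients almost exactly. With real customer data the estimates would carry uncertainty, which is why the quality measures matter.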
                                                   Neural networks are another popular supervised data mining application used to predict
                                               values and make classifications such as “good prospect” or “poor prospect” customers.
                                               The term neural networks is deceiving because it connotes a biological process similar to that
                                               in animal brains. In fact, although the original idea of neural nets may have come from the
                                               anatomy and physiology of neurons, a neural network is nothing more than a complicated
                                               set of possibly nonlinear equations. Explaining the techniques used for neural networks is
                                               beyond the scope of this text. If you want to learn more, search http://kdnuggets.com for the
                                               term neural network.
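
The claim that a neural network is a set of possibly nonlinear equations can be made concrete with a very small sketch. The network shape (two inputs, two hidden units, one output), every weight and bias, the 0.5 cutoff, and the example customer are all hypothetical. In practice the weights would be estimated from data, and that step is not shown here.

```python
import math


def sigmoid(z: float) -> float:
    """A common nonlinear function that squeezes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights and biases; a real network learns these from data.
HIDDEN_WEIGHTS = [(0.05, 0.10), (-0.08, 0.02)]  # (age weight, months weight)
HIDDEN_BIASES = [-2.0, 1.0]
OUTPUT_WEIGHTS = [1.5, -1.0]
OUTPUT_BIAS = -0.2


def prospect_score(age: float, months: float) -> float:
    """Evaluate the network: nested weighted sums passed through sigmoid."""
    hidden = [
        sigmoid(w_age * age + w_months * months + bias)
        for (w_age, w_months), bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    weighted_sum = sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden))
    return sigmoid(weighted_sum + OUTPUT_BIAS)


def classify(age: float, months: float) -> str:
    """Turn the score into one of the two labels used in the text."""
    return "good prospect" if prospect_score(age, months) >= 0.5 else "poor prospect"


# Hypothetical customer: 30 years old, account open for 24 months.
score = prospect_score(30, 24)
label = classify(30, 24)
print(label)
```

Nothing in this sketch is biological. It is three equations, each a weighted sum passed through a nonlinear function.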
                                                   In the next sections, we will describe and illustrate two typical data mining tools,
                                               market-basket analysis and decision trees, and show applications of those techniques. From
                                               this discussion, you can gain a sense of the nature of data mining. These examples should
                                               give you, a future manager, a sense of the possibilities of data mining techniques. You will
                                               need additional coursework in statistics, data management, marketing, and finance, however,
                                               before you will be able to perform such analyses yourself.

                                               Market-Basket Analysis
                                               Suppose you run a dive shop, and one day you realize that one of your salespeople is much better
                                               than the others at up-selling to your customers. Any of your sales associates can fill a customer’s order, but this
                                               one salesperson is especially good at selling customers items in addition to those for which they ask.
                                               One day, you ask him how he does it.
                                                   “It’s simple,” he says. “I just ask myself what is the next product they would want to buy.
                                               If someone buys a dive computer, I don’t try to sell her fins. If she’s buying a dive computer,
                                               she’s already a diver and she already has fins. But these dive computer displays are hard to
                                               read. A better mask makes it easier to read the display and get the full benefit from the dive
                                               computer.”
                                                   A market-basket analysis is an unsupervised data mining technique for determining sales
                                               patterns. A market-basket analysis shows the products that customers tend to buy together. In
                                               marketing transactions, the fact that customers who buy product X also buy product Y creates a
                                               cross-selling opportunity; that is, “If they’re buying X, sell them Y” or “If they’re buying Y, sell
                                               them X.”
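
The core of a market-basket analysis is counting how often items appear in the same transaction. The sketch below shows one way to do that counting. The five transactions are invented for illustration and are not the dive shop data discussed in the text, and the variable names are assumptions.

```python
from collections import Counter
from itertools import combinations

# Invented transactions; each set is the items bought in one sale.
transactions = [
    {"dive computer", "mask"},
    {"fins", "mask"},
    {"dive computer", "mask", "tank"},
    {"fins"},
    {"tank", "weights"},
]

item_counts = Counter()  # how many transactions include each item
pair_counts = Counter()  # how many transactions include both items of a pair

for basket in transactions:
    item_counts.update(basket)
    # Sort the items so each pair is always stored in the same order.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

for (item_x, item_y), count in pair_counts.most_common():
    print(f"{item_x} + {item_y}: bought together {count} time(s)")
```

In this invented data, the dive computer and the mask are the only pair bought together more than once, which is the kind of pattern that suggests a cross-selling opportunity.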
                                                   Figure 9-21 shows hypothetical sales data from 400 sales transactions at a dive shop. The
                                               number on the diagonal (shaded) in the first set of rows is the total number of times an item was