Page 410 -

Guide: Data Mining in the Real World






“Now suppose I give the names of the three customers to a salesperson who calls on them, and sure enough, we have a stream of bad luck and none of them buys. This bad result doesn’t mean the model is wrong. But what does the salesperson think? He thinks the model is worthless, and he can do better on his own. He tells his manager who tells her associate, who tells everyone in the Northeast Region, and sure enough, the model has a bad reputation all across the company.

“Another problem is seasonality. Say all your training data are from the summer. Will your model be valid for the winter? Maybe, but maybe not. You might even know that it won’t be valid for predicting winter sales, but if you don’t have winter data, what do you do?

“When you start a data mining project, you never know how it will turn out. I worked on one project for 6 months, and when we finished, I didn’t think our model was any good. We had too many problems with data: wrong, dirty, and missing. There was no way we could know ahead of time that it would happen, but it did.

“When the time came to present the results to senior management, what could we do? How could we say we took 6 months of our time and substantial computer resources to create a bad model? We had a model, but I just didn’t think it would make accurate predictions. I was a junior member of the team, and it wasn’t for me to decide. I kept my mouth shut, but I never felt good about it. Fortunately, the project was cancelled later for other reasons.

“However, I’m only talking about my bad experiences. Some of my projects have been excellent. On many, we found interesting and important patterns and information, and a few times I’ve created very accurate predictive models. It’s not easy, though, and you have to be very careful. Also, lucky!”










Discussion Questions



1.  Summarize the concerns expressed by this data analyst.
2.  Do you think the concerns raised here are sufficient to avoid data mining projects altogether?
3.  If you were a junior member of a data mining team and you thought that the model that had been developed was ineffective, maybe even wrong, what would you do? If your boss disagrees with your beliefs, would you go higher in the organization? What are the risks of doing so? What else might you do?