
Part 2  •  Information Requirements Analysis

                                         across the country. You might want to select 1 or 2 of these help desks, under the assumption that
                                         they are typical of the remaining ones.
                                         DECIDING ON THE SAMPLE SIZE.  Obviously, if everyone in the population viewed the world the
                                         same way or if each of the documents in a population contained exactly the same information as
                                         every other document, a sample size of one would be sufficient. Because that is not the case, it
                                         is necessary to set a sample size greater than one but less than the size of the population itself.
                                             In sampling, the absolute number of items sampled matters more than the percentage of
                                         the population it represents. Sampling 20 people can give satisfactory results whether the
                                         population is 200 or 2,000,000.
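
One way to see why the absolute number dominates is to compare the standard error of a sample proportion for the same sample of 20 drawn from two very different populations. The sketch below is an illustration only: it assumes simple random sampling without replacement and uses the standard finite population correction from sampling theory, which the text itself does not present. The function name `standard_error_finite` and the value p = 0.5 are hypothetical choices made for the example.

```python
import math


def standard_error_finite(p: float, n: int, population: int) -> float:
    """Standard error of a sample proportion, with finite population correction.

    Assumes simple random sampling without replacement (an assumption of
    this sketch, not a statement from the text).
    """
    if population < 2 or not 0 < n <= population:
        raise ValueError("need population >= 2 and 0 < n <= population")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    base = math.sqrt(p * (1 - p) / n)                     # depends only on n
    fpc = math.sqrt((population - n) / (population - 1))  # depends on population size
    return base * fpc


# Same sample size (20), very different population sizes.
for population in (200, 2_000_000):
    se = standard_error_finite(p=0.5, n=20, population=population)
    print(f"N = {population:>9,}: standard error = {se:.3f}")
```

The two standard errors differ only slightly, because the term that depends on the population size stays close to 1 whenever the sample is a small fraction of the population.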

                                         The Sample Size Decision
                                         The sample size often depends on the cost involved or the time required by the systems analyst,
                                         or even the time available from people in the organization. This subsection gives some guidelines
                                         for determining the required sample size under ideal conditions—for example, to determine what
                                         percentage of input forms contain errors or what proportion of people to interview.
                                             A systems analyst needs to follow seven steps, some of which involve subjective judgments,
                                         to determine the required sample size:
                                           1. Determine the attribute (in this case, the type of errors to look for).
                                           2. Locate the database or reports in which the attribute can be found.
                                           3. Examine the attribute. Estimate p, the proportion of the population having the attribute.
                                           4. Make the subjective decision regarding the acceptable interval estimate, i.
                                           5. Choose the confidence level and look up the confidence coefficient (z value) in a table.
                                           6. Calculate σ_p, the standard error of the proportion, as follows:

                                                  $$\sigma_p = \frac{i}{z}$$

                                           7. Determine the necessary sample size, n, using the following formula:

                                                  $$n = \frac{p(1 - p)}{\sigma_p^2} + 1$$

                                         The first step, of course, is to determine which attribute you will be sampling. Once this is done,
                                         you can find out where this data is stored, perhaps in a database, on a form, or in a report.
                                             It is important to estimate p, the proportion of the population having the attribute, so
                                         that you set the appropriate sample size. Many textbooks on systems analysis suggest using
                                         a heuristic of 0.25 for p(1 − p). This value almost always results in a sample size larger than
                                         necessary because 0.25 is the maximum value of p(1 − p), which occurs only when p = 0.50.
                                         When p = 0.10, as is more often the case, p(1 − p) becomes 0.09, resulting in a much smaller
                                         sample size.
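
The claim about the 0.25 heuristic can be checked with a short worked derivation. The notation follows the sample size formula in step 7, n = p(1 − p)/σ_p² + 1, and the final comparison assumes that σ_p is held fixed while p varies.

$$\frac{d}{dp}\bigl[p(1 - p)\bigr] = 1 - 2p = 0 \;\Longrightarrow\; p = 0.50, \qquad p(1 - p)\Big|_{p = 0.50} = 0.50 \times 0.50 = 0.25$$

$$p(1 - p)\Big|_{p = 0.10} = 0.10 \times 0.90 = 0.09, \qquad \frac{0.09}{0.25} = 0.36$$

For the same σ_p, then, the term p(1 − p)/σ_p² at p = 0.10 is only 36 percent of what the 0.25 heuristic would give.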
                                             Steps 4 and 5 are subjective decisions. An acceptable interval estimate of ±0.10, for
                                         example, means that you are willing to accept an error of no more than 0.10 in either direction
                                         from the actual proportion, p. The confidence level is the desired degree of certainty, such as
                                         95 percent. Once the confidence level is chosen, the confidence coefficient (also called a z
                                         value) can be looked up in a table like the one found in this chapter.
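
When no table is at hand, the z value can also be computed. The sketch below swaps the table lookup for a calculation with Python's standard-library `statistics.NormalDist`, and it assumes a two-sided interval, so the leftover probability is split evenly between the two tails. The function name `confidence_coefficient` is a hypothetical helper, not something defined in the text.

```python
from statistics import NormalDist


def confidence_coefficient(confidence_level: float) -> float:
    """Return the two-sided z value for a confidence level such as 0.95."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Two-sided interval: half of the leftover probability sits in each tail.
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {confidence_coefficient(level):.3f}")
```

For a 95 percent confidence level this gives the familiar z value of about 1.96.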
                                             Steps 6 and 7 complete the process by taking the parameters found or set in steps 3 through
                                         5 and entering them into two equations to solve for the required sample size.
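
Steps 6 and 7 can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, the inputs (p = 0.10, i = 0.10, z = 1.96 for 95 percent confidence) are illustrative values taken from the discussion above, and rounding the result up to a whole number is an assumption of the sketch, since a sample cannot contain a fractional item.

```python
import math


def standard_error(i: float, z: float) -> float:
    """Step 6: standard error of the proportion, sigma_p = i / z."""
    if i <= 0 or z <= 0:
        raise ValueError("i and z must both be positive")
    return i / z


def required_sample_size(p: float, i: float, z: float) -> int:
    """Step 7: n = p(1 - p) / sigma_p**2 + 1, rounded up (rounding is assumed)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    sigma_p = standard_error(i, z)
    n = p * (1 - p) / sigma_p ** 2 + 1
    return math.ceil(n)


# Illustrative inputs: estimated proportion, acceptable interval, z value.
print(required_sample_size(p=0.10, i=0.10, z=1.96))
# Same interval and confidence level, but with the 0.25 heuristic (p = 0.50).
print(required_sample_size(p=0.50, i=0.10, z=1.96))
```

Keeping step 6 in its own function mirrors the two-equation structure of the procedure and makes it easy to inspect σ_p before computing n.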

                                            EXAMPLE
                                            The foregoing steps can best be illustrated by an example. Suppose the A. Sembly
                                            Company, a large manufacturer of shelving products, asks you to determine what percent-
                                            age of orders contain errors. You agree to do this job and perform the following steps:
                                              1.  Determine that you will be looking for orders that contain mistakes in names,
                                               addresses, quantities, or model numbers.
                                              2.  Locate copies of order forms from the past six months.
                                              3.  Examine some of the order forms and conclude that only about 5 percent (0.05) contain
                                               errors.