
132 Part 2 • Information Requirements Analysis

across the country. You might want to select 1 or 2 of these help desks, under the assumption that
they are typical of the remaining ones.
DECIDING ON THE SAMPLE SIZE. Obviously, if everyone in the population viewed the world the
same way or if each of the documents in a population contained exactly the same information as
every other document, a sample size of one would be sufficient. Because that is not the case, it
is necessary to set a sample size greater than one but less than the size of the population itself.
It is important to remember that the absolute number is more important in sampling than the
percentage of the population. We can obtain satisfactory results sampling 20 people in 200 or 20
people in 2,000,000.

The Sample Size Decision
The sample size often depends on the cost involved or the time required by the systems analyst,
or even the time available from people in the organization. This subsection gives some guidelines
for determining the required sample size under ideal conditions—for example, to determine what
percentage of input forms contain errors or what proportion of people to interview.
A systems analyst needs to follow seven steps, some of which involve subjective judgments,
to determine the required sample size:
1. Determine the attribute (in this case, the type of errors to look for).
2. Locate the database or reports in which the attribute can be found.
3. Examine the attribute. Estimate p, the proportion of the population having the attribute.
4. Make the subjective decision regarding the acceptable interval estimate, i.
5. Choose the confidence level and look up the confidence coefficient (z value) in a table.
6. Calculate $\sigma_p$, the standard error of the proportion, as follows:
   $$\sigma_p = \frac{i}{z}$$
7. Determine the necessary sample size, n, using the following formula:
   $$n = \frac{p(1 - p)}{\sigma_p^2} + 1$$
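The two formulas in steps 6 and 7 can be sketched in Python as follows. This is a minimal illustration, not part of the original text: the function names are hypothetical, and rounding the result up to the next whole unit is an assumption made here so that the sample is never smaller than the formula requires.

```python
import math


def standard_error(i: float, z: float) -> float:
    """Step 6: standard error of the proportion, sigma_p = i / z."""
    if i <= 0 or z <= 0:
        raise ValueError("i and z must both be positive")
    return i / z


def required_sample_size(p: float, i: float, z: float) -> int:
    """Step 7: n = p(1 - p) / sigma_p^2 + 1.

    Rounding up with math.ceil is an assumption of this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    sigma_p = standard_error(i, z)
    n = p * (1 - p) / sigma_p ** 2 + 1
    return math.ceil(n)


# Illustrative inputs (assumed, not taken from the text):
# p = 0.50, interval estimate i = 0.05, z = 1.96.
print(required_sample_size(p=0.50, i=0.05, z=1.96))  # prints 386
```

Note that the population size does not appear anywhere in the calculation, which is consistent with the earlier point that the absolute number sampled matters more than the percentage of the population.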
The first step, of course, is to determine which attribute you will be sampling. Once this is done,
you can find out where this data is stored, perhaps in a database, on a form, or in a report.
It is important to estimate p, the proportion of the population having the attribute, so
that you set the appropriate sample size. Many textbooks on systems analysis suggest using
a heuristic of 0.25 for p(1 − p). This value almost always results in a sample size larger than
necessary because 0.25 is the maximum value of p(1 − p), which occurs only when p = 0.50.
When p = 0.10, as is more often the case, p(1 − p) becomes 0.09, resulting in a much smaller
sample size.
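A short sketch can make the effect of the 0.25 heuristic concrete. The interval estimate (0.05) and z value (1.96) used below are illustrative assumptions, not values from the text; only the two values of p(1 − p) come from the paragraph above.

```python
def unrounded_sample_size(p_times_q: float, i: float, z: float) -> float:
    """Apply n = p(1 - p) / sigma_p^2 + 1, where sigma_p = i / z."""
    sigma_p = i / z
    return p_times_q / sigma_p ** 2 + 1


# Assumed for illustration only: i = 0.05, z = 1.96.
I_ASSUMED = 0.05
Z_ASSUMED = 1.96

# Heuristic: p(1 - p) = 0.25, the maximum, reached when p = 0.50.
worst_case = unrounded_sample_size(0.25, I_ASSUMED, Z_ASSUMED)

# Estimated p = 0.10, so p(1 - p) = 0.09.
typical = unrounded_sample_size(0.10 * (1 - 0.10), I_ASSUMED, Z_ASSUMED)

print(f"heuristic 0.25: n = {worst_case:.2f}")
print(f"p = 0.10:       n = {typical:.2f}")
```

Under these assumed inputs, the heuristic calls for roughly 385 units while the estimate p = 0.10 calls for roughly 139, which shows why estimating p in step 3 is worth the effort.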
Steps 4 and 5 are subjective decisions. The acceptable interval estimate of ±0.10 means that
you are willing to accept an error of no more than 0.10 in either direction from the actual
proportion, p. The confidence level is the desired degree of certainty, such as 95 percent. Once the
confidence level is chosen, the confidence coefficient (also called a z value) can be looked up in
a table like the one found in this chapter.
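If a printed table is not at hand, the confidence coefficient can also be computed. The sketch below is an alternative to the table lookup described in the text, not the text's own method. It assumes a two-sided interval based on the standard normal distribution, and it uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The function name is hypothetical.

```python
from statistics import NormalDist


def confidence_coefficient(confidence_level: float) -> float:
    """Return the two-sided z value for a confidence level such as 0.95.

    Assumes a standard normal distribution with the leftover
    probability split evenly between the two tails.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {confidence_coefficient(level):.3f}")
```

For a 95 percent confidence level this gives a z value of about 1.96, the figure commonly listed in printed tables.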
Steps 6 and 7 complete the process by taking the parameters found or set in steps 3 through
5 and entering them into two equations to solve for the required sample size.

EXAMPLE
The foregoing steps can best be illustrated by an example. Suppose the A. Sembly
Company, a large manufacturer of shelving products, asks you to determine what percentage
of orders contain errors. You agree to do this job and perform the following steps:
1. Determine that you will be looking for orders that contain mistakes in names,
addresses, quantities, or model numbers.
2. Locate copies of order forms from the past six months.
3. Examine some of the order forms and conclude that only about 5 percent (0.05) contain
errors.
