7.2 Probability Theory


It follows that the probability that there is at least one collision, that is, at least two keys are mapped to the same location, is

$$1 - p_n = 1 - \frac{m-1}{m} \cdot \frac{m-2}{m} \cdot \cdots \cdot \frac{m-n+1}{m}.$$

Techniques from calculus can be used to find the smallest value of n given a value of m such that the probability of a collision is greater than a particular threshold. It can be shown that the smallest integer n such that the probability of a collision is greater than 1/2 is approximately $n = 1.177\sqrt{m}$. For example, when m = 1,000,000, the smallest integer n such that the probability of a collision is greater than 1/2 is 1178. ▲
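A minimal computational sketch of this calculation (in Python, not part of the text) finds the smallest such n by accumulating the factors (m − k)/m until the probability of a collision exceeds 1/2:

```python
def smallest_n_for_collision(m, threshold=0.5):
    """Smallest n such that the probability of at least one collision
    among n keys hashed into m memory locations exceeds threshold."""
    p_no_collision = 1.0
    n = 1
    while True:
        n += 1
        # The nth key must avoid the n - 1 locations already occupied,
        # contributing the factor (m - (n - 1)) / m.
        p_no_collision *= (m - (n - 1)) / m
        if 1 - p_no_collision > threshold:
            return n

# Matches the value quoted above: 1178 when m = 1,000,000.
print(smallest_n_for_collision(1_000_000))  # 1178
```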


                                                     Monte Carlo Algorithms

                                                     The algorithms discussed so far in this book are all deterministic. That is, each algorithm always
                                                     proceeds in the same way whenever given the same input. However, there are many situations
                                                     where we would like an algorithm to make a random choice at one or more steps. Such a
                                                     situation arises when a deterministic algorithm would have to go through a huge number, or even
                                                     an unknown number, of possible cases. Algorithms that make random choices at one or more
                                                     steps are called probabilistic algorithms. We will discuss a particular class of probabilistic
                                                     algorithms in this section, namely, Monte Carlo algorithms, for decision problems. Monte
                                                     Carlo algorithms always produce answers to problems, but a small probability remains that
                                                     these answers may be incorrect. However, the probability that the answer is incorrect decreases
                                                     rapidly when the algorithm carries out sufficient computation. Decision problems have either
“true” or “false” as their answer. The designation “Monte Carlo” is a reference to the famous casino in Monaco; the use of randomness and the repetitive processes in these algorithms make them similar to some gambling games. This name was introduced by the inventors of Monte Carlo methods, including Stan Ulam, Enrico Fermi, and John von Neumann. (Monte Carlo methods were invented to help develop the first nuclear weapons.)
                                                        A Monte Carlo algorithm for a decision problem uses a sequence of tests. The probability
                                                     that the algorithm answers the decision problem correctly increases as more tests are carried
                                                     out. At each step of the algorithm, possible responses are “true,” which means that the answer
                                                     is “true” and no additional iterations are needed, or “unknown,” which means that the answer
                                                     could be either “true” or “false.” After running all the iterations in such an algorithm, the final
                                                     answer produced is “true” if at least one iteration yields the answer “true,” and the answer is
                                                     “false” if every iteration yields the answer “unknown.” If the correct answer is “false,” then the
                                                     algorithm answers “false,” because every iteration will yield “unknown.” However, if the correct
                                                     answer is “true,” then the algorithm could answer either “true” or “false,” because it may be
                                                     possible that each iteration produced the response “unknown” even though the correct response
                                                     was “true.” We will show that this possibility becomes extremely unlikely as the number of tests
                                                     increases.
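As an illustrative sketch of this structure (in Python, not from the text; the `test` argument is a hypothetical placeholder for whatever randomized test a particular algorithm uses), the final answer is “true” exactly when some iteration responds “true”:

```python
import random

def monte_carlo_decide(test, n):
    """Run n independent iterations of a randomized test that returns
    "true" or "unknown", combined as described above: answer "true"
    if any iteration responds "true", otherwise answer "false"."""
    for _ in range(n):
        if test() == "true":
            return "true"   # one "true" response settles the answer
    # All n iterations responded "unknown"; answer "false", which is
    # wrong only when the correct answer was actually "true".
    return "false"

# Demonstration only: a test that responds "true" with probability 1/2
# when the hidden correct answer is "true".
print(monte_carlo_decide(lambda: random.choice(["true", "unknown"]), 10))
```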
Suppose that p is the probability that the response of a test is “true,” given that the answer is “true.” It follows that 1 − p is the probability that the response is “unknown,” given that the answer is “true.” Because the algorithm answers “false” when all n iterations yield the answer “unknown” and the iterations perform independent tests, the probability of error is $(1 - p)^n$. When $p \neq 0$, this probability approaches 0 as the number of tests increases. Consequently, the probability that the algorithm answers “true” when the answer is “true” approaches 1.
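To get a concrete sense of how quickly the error probability $(1 - p)^n$ shrinks, the following sketch (assuming, purely for illustration, a test with p = 0.5) computes how many iterations are needed to push the error below a chosen bound:

```python
import math

def iterations_needed(p, max_error):
    """Smallest n with (1 - p)**n <= max_error, i.e., enough independent
    tests to bound the probability of a wrong "false" answer."""
    # Solve (1 - p)**n <= max_error for n by taking logarithms.
    return math.ceil(math.log(max_error) / math.log(1 - p))

# With p = 0.5, ten tests already bring the error below 1/1000:
print(iterations_needed(0.5, 1e-3))   # 10
print((1 - 0.5) ** 10)                # 0.0009765625
```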
EXAMPLE 15    Quality Control (This example is adapted from [AhUl95].) Suppose that a PC manufacturer orders processor chips in batches of size n, where n is a positive integer. The chip maker
                                                     has tested only some of these batches to make sure that all the chips in the batch are good
                                                     (replacing any bad chips found during testing with good ones). In previously untested batches,
                                                     the probability that a particular chip is bad has been observed to be 0.1 when random testing
                                                     is done. The PC manufacturer wants to decide whether all the chips in a batch are good. To