
9. Descent Graph Methods
is also implicit in definition (9.5). Now suppose without loss of generality
that the fraction
\[
\frac{\pi_j q_{ji}}{\pi_i q_{ij}} \;\le\; 1
\]
for some $j \ne i$. Then detailed balance follows immediately from
\[
\pi_i q_{ij} a_{ij} \;=\; \pi_i q_{ij}\,\frac{\pi_j q_{ji}}{\pi_i q_{ij}}
\;=\; \pi_j q_{ji}
\;=\; \pi_j q_{ji} a_{ji}.
\]
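The following is a minimal numerical sketch, not from the text, of the detailed balance property just derived: it builds the Hastings-Metropolis transition kernel from the acceptance probability of (9.5) and checks that $\pi_i p_{ij} = \pi_j p_{ji}$. The target distribution pi and the proposal matrix q are arbitrary illustrative choices.

import numpy as np

# illustrative target distribution and proposal probabilities q_ij (rows sum to 1)
pi = np.array([0.1, 0.2, 0.3, 0.4])
q = np.array([[0.0, 0.5, 0.3, 0.2],
              [0.4, 0.0, 0.4, 0.2],
              [0.3, 0.3, 0.0, 0.4],
              [0.2, 0.3, 0.5, 0.0]])

n = len(pi)
p = np.zeros((n, n))                    # Hastings-Metropolis transition kernel
for i in range(n):
    for j in range(n):
        if i != j:
            a = min(1.0, pi[j] * q[j, i] / (pi[i] * q[i, j]))   # acceptance (9.5)
            p[i, j] = q[i, j] * a
    p[i, i] = 1.0 - p[i].sum()          # probability of remaining at state i

# detailed balance: pi_i p_ij should equal pi_j p_ji for every pair of states
flux = pi[:, None] * p
assert np.allclose(flux, flux.T), "detailed balance violated"
print("detailed balance holds; pi is stationary:", np.allclose(pi @ p, pi))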
  The Gibbs sampler is a special case of the Hastings-Metropolis algorithm
for Cartesian product state spaces [7, 8, 10]. Suppose that each sample
point $i = (i_1, \ldots, i_m)$ has $m$ components. For instance, $i_c$ could
represent the genotype of person $c$ in a pedigree of $m$ people. The Gibbs
sampler updates one component of $i$ at a time. If the component is chosen
randomly and resampled conditional on the remaining components, then the
acceptance probability is 1. To prove this assertion, let $i_c$ be the
uniformly chosen component, and denote the remaining components by
$i_{-c} = (i_1, \ldots, i_{c-1}, i_{c+1}, \ldots, i_m)$. If $j$ is a neighbor
of $i$ reachable by changing only component $i_c$, then $j_{-c} = i_{-c}$.
Hence, the proposal probability

\[
q_{ij} \;=\; \frac{1}{m}\,\frac{\pi_j}{\sum_{\{k : k_{-c} = i_{-c}\}} \pi_k}
\]
satisfies $\pi_i q_{ij} = \pi_j q_{ji}$, and the ratio appearing in the
acceptance probability (9.5) is 1. In the location score application
discussed in this chapter, the Gibbs sampler leads to chains that either
mix too slowly or are reducible. For this reason, the general Metropolis
algorithm is preferable.
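Here is a minimal sketch, not from the text, of the random-scan Gibbs update just described: a component is chosen uniformly and resampled from its conditional distribution given the rest, and the Hastings ratio $\pi_j q_{ji}/(\pi_i q_{ij})$ of (9.5) is confirmed to equal 1. The joint distribution pi over m binary components, the coupling matrix theta, and the helper gibbs_step are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
m = 5
theta = rng.normal(size=(m, m))         # arbitrary pairwise couplings

def pi(x):
    # unnormalized joint probability of a 0/1 vector x (illustrative choice)
    return np.exp(x @ theta @ x)

def gibbs_step(x):
    c = rng.integers(m)                 # uniformly chosen component i_c
    # full conditional of component c given the remaining components i_{-c}
    probs = np.empty(2)
    for v in (0, 1):
        y = x.copy()
        y[c] = v
        probs[v] = pi(y)
    probs /= probs.sum()
    y = x.copy()
    y[c] = rng.choice(2, p=probs)       # proposal drawn from the conditional
    # proposal densities q_ij and q_ji, each including the 1/m component choice
    q_forward = probs[y[c]] / m
    q_backward = probs[x[c]] / m
    ratio = pi(y) * q_backward / (pi(x) * q_forward)   # Hastings ratio in (9.5)
    assert np.isclose(ratio, 1.0)       # hence the move is always accepted
    return y

x = rng.integers(0, 2, size=m)
for _ in range(1000):
    x = gibbs_step(x)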
  In simulated annealing we are interested in finding the most probable
state of a Markov chain [17, 31]. If this state is $k$, then $\pi_k > \pi_i$
for all $i \ne k$. To accentuate the weight given to state $k$, we can
replace the equilibrium distribution $\pi$ by a distribution putting
probability
\[
\pi_i^{(\tau)} \;=\; \frac{\pi_i^{1/\tau}}{\sum_j \pi_j^{1/\tau}}
\]
on state $i$. Here $\tau$ is a small, positive parameter traditionally
called temperature. For a chain with symmetric proposal density, the
distribution $\pi_i^{(\tau)}$ can be attained by running the chain with
acceptance probability
\[
a_{ij} \;=\; \min\Bigl\{1,\, \Bigl(\frac{\pi_j}{\pi_i}\Bigr)^{1/\tau}\Bigr\}. \tag{9.7}
\]
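The following is a minimal simulated annealing sketch, not from the text: a Metropolis chain with a symmetric proposal is run under the acceptance probability (9.7) while, as the next sentence describes, $\tau$ is gradually lowered toward 0. The target pi, the uniform proposal over 50 states, and the geometric cooling rate are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
pi = rng.random(50)                     # arbitrary unnormalized target distribution
i = rng.integers(pi.size)               # starting state
tau = 5.0                               # initial (high) temperature
for step in range(20000):
    j = rng.integers(pi.size)           # symmetric proposal: q_ij = q_ji
    ratio = pi[j] / pi[i]
    accept = 1.0 if ratio >= 1.0 else ratio ** (1.0 / tau)   # acceptance (9.7)
    if rng.random() < accept:
        i = j
    tau *= 0.9995                       # assumed geometric cooling toward 0
print("annealed state:", i, "most probable state:", int(np.argmax(pi)))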
In fact, in simulated annealing the chain is run with $\tau$ gradually
decreasing to 0. If $\tau$ starts out large, then in the early steps