Page 70 - Applied Probability

3. Newton's Method and Scoring

5. A family of discrete density functions $p_n(\theta)$ defined on $\{0, 1, \ldots\}$
   and indexed by a parameter $\theta > 0$ is said to be a power-series family if
   for all $n$

   $$p_n(\theta) = \frac{c_n \theta^n}{g(\theta)}, \tag{3.18}$$

   where $c_n \ge 0$ and where $g(\theta) = \sum_{k=0}^{\infty} c_k \theta^k$ is the
   appropriate normalizing constant. If $X_1, \ldots, X_m$ is a random sample from
   the discrete density (3.18) with observed values $x_1, \ldots, x_m$, then show
   that the maximum likelihood estimate of $\theta$ is a root of the equation

   $$\frac{1}{m} \sum_{i=1}^{m} x_i = \frac{\theta g'(\theta)}{g(\theta)}.$$

   Prove that the expected information in a single observation is

   $$J(\theta) = \frac{\sigma^2(\theta)}{\theta^2},$$

   where $\sigma^2(\theta)$ is the variance of the density (3.18).
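
A numerical check can make the two claims of Exercise 5 concrete before attempting the proof. The sketch below takes one member of the family, the Poisson density with $c_n = 1/n!$ and $g(\theta) = e^\theta$, solves the mean equation by bisection, and evaluates $\sigma^2(\theta)/\theta^2$ at the root. The helper names (`series_moments`, `mle_theta`, `poisson_c`), the truncation of the series at 100 terms, the bisection bracket, and the sample `xs` are all assumptions made for illustration; none of them comes from the text.

```python
import math

def series_moments(c, theta, n_terms=100):
    """Mean and variance of the power-series density with coefficients c(n).

    Uses a truncated series, so it assumes the tail beyond n_terms is negligible.
    """
    weights = [c(n) * theta**n for n in range(n_terms)]
    g = sum(weights)                                        # normalizing constant g(theta)
    mean = sum(n * w for n, w in enumerate(weights)) / g    # equals theta * g'(theta) / g(theta)
    second = sum(n * n * w for n, w in enumerate(weights)) / g
    return mean, second - mean**2

def mle_theta(c, xs, lo=1e-8, hi=20.0, tol=1e-12):
    """Solve mean(theta) = sample mean by bisection.

    Relies on the mean being nondecreasing in theta and on the root lying in [lo, hi].
    """
    target = sum(xs) / len(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if series_moments(c, mid)[0] < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson_c(n):
    # Poisson coefficients c_n = 1 / n!, for which g(theta) = exp(theta).
    return 1.0 / math.factorial(n)

xs = [2, 0, 3, 1, 4, 2]                  # made-up sample with mean 2
theta_hat = mle_theta(poisson_c, xs)     # root of the likelihood equation
mean, var = series_moments(poisson_c, theta_hat)
info = var / theta_hat**2                # sigma^2(theta) / theta^2
print(theta_hat, info)
```

For the Poisson case the mean equation reduces to $\theta$ equal to the sample mean, and the information reduces to $1/\theta$, so the printed values should be close to 2 and 0.5. Swapping in another coefficient function `c` gives the same check for other members of the family, provided the truncated series is accurate over the bracket.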
6. Let the $m$ independent random variables $X_1, \ldots, X_m$ be normally
   distributed with means $\mu_i(\theta)$ and variances $\sigma^2 / w_i$, where the
   $w_i$ are known constants. From observed values $X_1 = x_1, \ldots, X_m = x_m$,
   one can estimate the mean parameters $\theta$ and the variance parameter
   $\sigma^2$ simultaneously by the scoring algorithm. Prove that scoring updates
   $\theta$ by

   $$\theta_{n+1} = \theta_n + \Big[ \sum_{i=1}^{m} w_i \, d\mu_i(\theta_n)^t \, d\mu_i(\theta_n) \Big]^{-1} \sum_{i=1}^{m} w_i \, [x_i - \mu_i(\theta_n)] \, d\mu_i(\theta_n)^t \tag{3.19}$$

   and $\sigma^2$ by

   $$\sigma^2_{n+1} = \frac{1}{m} \sum_{i=1}^{m} w_i \, [x_i - \mu_i(\theta_n)]^2.$$

   In the least-squares literature, the scoring update of $\theta$ is better known
   as the Gauss-Newton algorithm.
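
The updates of Exercise 6 can be sketched in a few lines for a two-parameter model. The mean function $\mu_i(\theta) = \theta_1 e^{-\theta_2 t_i}$, the design points `ts`, the weights `ws`, the noise-free data, and the starting value are all hypothetical choices made for illustration; only the form of the two updates is taken from (3.19) and the accompanying formula for $\sigma^2$. The 2 x 2 linear system is solved by Cramer's rule to keep the sketch free of external libraries.

```python
import math

def mu(theta, t):
    # Hypothetical mean function: theta[0] * exp(-theta[1] * t).
    return theta[0] * math.exp(-theta[1] * t)

def dmu(theta, t):
    # First partial derivatives of mu with respect to theta[0] and theta[1].
    e = math.exp(-theta[1] * t)
    return (e, -theta[0] * t * e)

def scoring_step(theta, ts, xs, ws):
    """One scoring (Gauss-Newton) update of theta and the matching sigma^2 update.

    Both updates use residuals and derivatives evaluated at the current theta.
    """
    a11 = a12 = a22 = b1 = b2 = rss = 0.0
    for t, x, w in zip(ts, xs, ws):
        d1, d2 = dmu(theta, t)
        r = x - mu(theta, t)
        a11 += w * d1 * d1      # entries of sum_i w_i dmu_i^t dmu_i
        a12 += w * d1 * d2
        a22 += w * d2 * d2
        b1 += w * r * d1        # entries of sum_i w_i [x_i - mu_i] dmu_i^t
        b2 += w * r * d2
        rss += w * r * r        # weighted residual sum of squares
    det = a11 * a22 - a12 * a12
    step = ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
    new_theta = (theta[0] + step[0], theta[1] + step[1])
    return new_theta, rss / len(xs)

ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ws = [1.0, 2.0, 1.0, 2.0, 1.0]
true_theta = (2.0, 0.5)
xs = [mu(true_theta, t) for t in ts]    # noise-free data, so the fit should recover true_theta

theta, sigma2 = (1.5, 0.3), None
for _ in range(20):
    theta, sigma2 = scoring_step(theta, ts, xs, ws)
print(theta, sigma2)
```

Because the data here carry no noise, the iterates should settle at the generating parameters and the variance estimate should shrink toward zero. With noisy data the same loop applies, but convergence from a poor starting value is not guaranteed and a step-halving safeguard may be needed.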
7. In the Gauss-Newton algorithm (3.19), the matrix

   $$\sum_{i=1}^{m} w_i \, d\mu_i(\theta_n)^t \, d\mu_i(\theta_n)$$