Page 70 - Applied Probability
3. Newton's Method and Scoring

5. A family of discrete density functions $p_n(\theta)$ defined on
$\{0, 1, \ldots\}$ and indexed by a parameter $\theta > 0$ is said to be
a power-series family if for all $n$
$$p_n(\theta) = \frac{c_n \theta^n}{g(\theta)}, \tag{3.18}$$
where $c_n \ge 0$ and where $g(\theta) = \sum_{k=0}^{\infty} c_k \theta^k$
is the appropriate normalizing constant. If $X_1, \ldots, X_m$ is a random
sample from the discrete density (3.18) with observed values
$x_1, \ldots, x_m$, then show that the maximum likelihood estimate of
$\theta$ is a root of the equation
$$\frac{1}{m} \sum_{i=1}^{m} x_i = \frac{\theta g'(\theta)}{g(\theta)}.$$
Prove that the expected information in a single observation is
$$J(\theta) = \frac{\sigma^2(\theta)}{\theta^2},$$
where $\sigma^2(\theta)$ is the variance of the density (3.18).
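
As a numerical sanity check on Problem 5 (not a proof), the sketch below treats the Poisson family as one instance of a power-series family, with $c_n = 1/n!$ and $g(\theta) = e^\theta$. It solves the likelihood equation by bisection and compares a finite-difference estimate of the expected information with $\sigma^2(\theta)/\theta^2$. The function names, the sample, the bisection bracket, and the truncation of the series at 100 terms are all assumptions made for illustration.

```python
import math

def series_stats(c, theta, n_terms=100):
    """Return (g, probs, mean, var) for p_n = c(n) * theta**n / g(theta).
    The series is truncated after n_terms terms, so this is an approximation."""
    terms = [c(n) * theta**n for n in range(n_terms)]
    g = sum(terms)
    probs = [t / g for t in terms]
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return g, probs, mean, var

def solve_likelihood_equation(c, xbar, lo, hi, tol=1e-12):
    """Bisection for theta * g'(theta) / g(theta) = xbar on [lo, hi].
    The left side is the mean of the density; its derivative in theta is
    var / theta >= 0, so the mean is nondecreasing and bisection applies."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if series_stats(c, mid)[2] < xbar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def information_by_finite_difference(c, theta, h=1e-5, n_terms=100):
    """Approximate E[(d/dtheta log p_N(theta))^2] with a central difference.
    The log c(n) term does not depend on theta, so it cancels in the difference."""
    probs = series_stats(c, theta, n_terms)[1]
    d_log_g = (math.log(series_stats(c, theta + h, n_terms)[0])
               - math.log(series_stats(c, theta - h, n_terms)[0]))
    d_log_theta = math.log(theta + h) - math.log(theta - h)
    info = 0.0
    for n, p in enumerate(probs):
        score = (n * d_log_theta - d_log_g) / (2.0 * h)
        info += p * score**2
    return info

def poisson_c(n):
    """Coefficients c_n = 1/n! of the Poisson family (assumed example)."""
    return 1.0 / math.factorial(n)

sample = [2, 3, 1, 4, 2, 3]          # hypothetical observed values
xbar = sum(sample) / len(sample)     # sample mean on the left of the equation

theta_hat = solve_likelihood_equation(poisson_c, xbar, 1e-6, 20.0)
info_numeric = information_by_finite_difference(poisson_c, theta_hat)
info_formula = series_stats(poisson_c, theta_hat)[3] / theta_hat**2

print(theta_hat, info_numeric, info_formula)
```

For the Poisson family the right side of the likelihood equation reduces to $\theta$, so the root should agree with the sample mean, and both information values should be close to $1/\theta$. Any other choice of nonnegative coefficients whose series converges on the chosen bracket could be substituted for `poisson_c`.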
6. Let the $m$ independent random variables $X_1, \ldots, X_m$ be normally
distributed with means $\mu_i(\theta)$ and variances $\sigma^2 / w_i$,
where the $w_i$ are known constants. From observed values
$X_1 = x_1, \ldots, X_m = x_m$, one can estimate the mean parameters
$\theta$ and the variance parameter $\sigma^2$ simultaneously by the
scoring algorithm. Prove that scoring updates $\theta$ by
$$\theta_{n+1} = \theta_n + \Big[ \sum_{i=1}^{m} w_i \, d\mu_i(\theta_n)^t \, d\mu_i(\theta_n) \Big]^{-1} \sum_{i=1}^{m} w_i \, [x_i - \mu_i(\theta_n)] \, d\mu_i(\theta_n)^t \tag{3.19}$$
and $\sigma^2$ by
$$\sigma^2_{n+1} = \frac{1}{m} \sum_{i=1}^{m} w_i \, [x_i - \mu_i(\theta_n)]^2 .$$
In the least-squares literature, the scoring update of θ is better known
as the Gauss-Newton algorithm.
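
The updates in Problem 6 can be sketched in a few lines of code. The example below is a minimal illustration in pure Python, assuming a hypothetical two-parameter mean function $\mu_i(\theta) = \theta_0 e^{-\theta_1 t_i}$ with made-up design points, weights, and noise-free data. The helper names are not from the text, and the sketch omits safeguards such as step halving that a practical implementation would likely need.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Intended only for the small dense systems in this sketch."""
    n = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def gauss_newton_step(theta, x, w, mu, dmu):
    """One scoring step: returns (theta_{n+1}, sigma^2_{n+1}).
    mu(theta, i) is the i-th mean; dmu(theta, i) is its gradient as a list."""
    p, m = len(theta), len(x)
    a = [[0.0] * p for _ in range(p)]   # sum_i w_i dmu_i^t dmu_i
    b = [0.0] * p                       # sum_i w_i [x_i - mu_i] dmu_i^t
    for i in range(m):
        grad = dmu(theta, i)
        resid = x[i] - mu(theta, i)
        for j in range(p):
            b[j] += w[i] * resid * grad[j]
            for k in range(p):
                a[j][k] += w[i] * grad[j] * grad[k]
    step = solve_linear(a, b)
    theta_new = [t + s for t, s in zip(theta, step)]
    # The variance update uses the residuals at the current iterate theta_n.
    sigma2_new = sum(w[i] * (x[i] - mu(theta, i)) ** 2 for i in range(m)) / m
    return theta_new, sigma2_new

# Hypothetical model and data, chosen only for illustration.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0]
true_theta = [2.0, 0.5]

def mu(theta, i):
    return theta[0] * math.exp(-theta[1] * t[i])

def dmu(theta, i):
    e = math.exp(-theta[1] * t[i])
    return [e, -theta[0] * t[i] * e]

x = [mu(true_theta, i) for i in range(len(t))]   # noise-free observations

theta = [1.5, 0.4]                               # starting value
sigma2 = None
for _ in range(20):
    theta, sigma2 = gauss_newton_step(theta, x, w, mu, dmu)

print(theta, sigma2)
```

Because the data here are generated without noise, the iterates should approach the generating parameters and the variance estimate should shrink toward zero. With noisy data the variance update would instead settle at the weighted mean squared residual.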
7. In the Gauss-Newton algorithm (3.19), the matrix
$$\sum_{i=1}^{m} w_i \, d\mu_i(\theta_n)^t \, d\mu_i(\theta_n)$$