[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

# RE: st: Maximum Likelihood

 From "Kobes, Deborah" <[email protected]> To "'[email protected]'" <[email protected]> Subject RE: st: Maximum Likelihood Date Mon, 5 May 2003 11:14:10 -0400

```thanks so much - changing to normprob(-z) fixed everything!

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Monday, May 05, 2003 10:13 AM
To: [email protected]
Subject: Re: st: Maximum Likelihood

Deborah Kobes <[email protected]> is having problem estimating a model
with -ml-:

> I made those changes and still got missings with the 'no constant'
> specification [...]  I have read the Gould and Scribney book but haven't
> found anything about why the constant/no constant specification would
change
> whether the model works.
>
> Any further ideas would be much appreciated

Evidently, Deborah can estimate the model fine if a constant is included in
the model, but if she suppresses the constant, -ml- interates a bit then
reports

>       .         ml maximize;
>
>       initial:       log likelihood = -30971.235
>       alternative:   log likelihood = -21880.591
>       rescale:       log likelihood = -20492.416
>       rescale eq:    log likelihood = -20304.363
>       Iteration 0:   log likelihood = -20304.363
>       Iteration 1:   log likelihood =  -17187.65
>       Iteration 2:   log likelihood = -17036.272
>       Iteration 3:   log likelihood = -17034.944
>       Iteration 4:   log likelihood = -17034.944  (not concave)
>
>       [output follows with missing values for some standard errors]

For Deborah's information, models without an intercept are often more
difficult
to fit because, in a single parameter, all scaling problems are solved.
Something like.  Consider a term such as

ln(normprob(x*b))

The above calculation produces missing value if x*b <= -38 and, worse, it
not
all that accurate at -38 < x*b < -30.  Understand that normprob(-30) =
4.91e-198, and it is difficult to describe in words just what a small
probability that is.  Physcisists wonder if protons ultimately evaporate
but, if they do, the universe is not yet old enough that many would have.
That the kind of probably 4.91e-198 is.

If the parameters, during estimation, wonder into this area of the likelhood
function, problems are assured.  An intercept tends to solve all those
problems:

ln(normprob(a + x*b))

Now, no longer how absurd a set of b values, and even if x*b becomes very
large negatively, one parameter -- a -- tends to get set to keep the overall
linear combination a+x*b in a reasonable area of the function.  That is why
intercepts help so much.

In any case, when writing an evaluator for a likelihood function that will
be used without an intercept, extreme care must be taken in the numerical
evaluation.  The hill climber really does not want to go into these
extreme areas, but numerical inaccuracy can cause it to wander temporarily
off course.  There is exactly such a problem in Deborah's evaluator:

program define ml_move2
args lnf theta1 theta2;
quietly replace `lnf' =
ln( (1 - normprob(`theta1' + `theta2'))*probmax +
(1 - normprob(`theta2')*(1-probmax)) )
if \$ML_y1==1
quietly replace `lnf' =
ln(  normprob(`theta1' + `theta2')*probmax +
normprob(`theta2')* > (1-probmax) )
if \$ML_y1==0
end

The problem line is

ln( (1 - normprob(`theta1' + `theta2'))*probmax +
(1 - normprob(`theta2')*(1-probmax)) )
if \$ML_y1==1

and the particular parts in question are

(1 - normprob(`theta1' + `theta2'))*probmax
and
(1 - normprob(`theta2')*(1-probmax))

Never code 1-normprob(z).  Code instead normprob(-z), which is far more
accurate.  The problem line should read

ln( (normprob(-`theta1' - `theta2'))*probmax +
(normprob(-`theta2')*(1-probmax)) )
if \$ML_y1==1

With luck, this will solve Deborah's problem.

If not, then Deborah is going to have obtain better starting values, too,
and
here is how she can do that:

1.  Fit the model with an intercept.
She has two equations, and lets call the results a1+b1*x and
a2+b2*z.

2.  In the data, generate

. gen i1 = a1+b1*x
. gen i2 = a2+b2*z

. gen j1 = b1*x
. gen j2 = b2*z

3.  Obtain the means of each:

. summarize i1 j1

. summarize i2 j2

4.  Solve for constants c1 and c2 such that

mean(i1) = mean(j1)*c1

mean(i2) = mean(j2)*c2

5.  Use as the starting values c1*b1 and c2*b2 in estimation without
the intercept.

The above is an attempt to find decent starting values of the right scale
for the no-constant model.

I suspect, however, that finding better starting values will not be
necessary
once the program is fixed to be more accurate.

-- Bill
[email protected]
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```

 © Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index