Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

RE: st: RE: Problems in maximising a likelihood function


From   Deepankar Basu <basu.15@osu.edu>
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: RE: Problems in maximising a likelihood function
Date   Wed, 14 Jun 2006 19:28:35 -0400

The dimensionality of my parameter space is 15; my sample is pretty
large. I have about 50,000 observations (with lot of variation in all
the regressors). Hence, I do not expect any problems on that count. Of
course, my model might be unidentified for some parameters (though I
don't think that is the case). I will think further about the
identifiability issue.

I have removed all temporary variables from the program now. I have used
scalars for all the temporary variables.

But I cannot clean up the expression for the log likelihood any further.
As I had said earlier, the log likelihood for each observation is of the
following form: ln(a+b); where 'a' and 'b' are huge expressions
involving exponentials, etc. I don't see how I can simplify it any
further; if you can suggest something, that will be great.

As far I can see, all the expressions in the re-worked likelihood
evaluator (given below) are well-behaved. I might be wrong, but let me
just go over each:

q1...q14 are defined in such a manner that they are always within 0 and
1, and sub-groups of them sum to 1. For instance, q1+q2=1; q3+q4+q5=1:
and so on.

x1 is bounded since ML_y1 lies between 2 and 6 in the dataset.
(Additionally, ML_y2...ML_y16 all lies between 0 and 1.)

x2 is just the factorial of ML_y1, and so it is bounded.

x3 is the product of x1 and x2.

x4 and x6 lie between 0 and 1; the limit of x4 is 0 when `theta1' goes
to + or - infinity, and the limit is (1/e) when `theta1' goes to 0.
Additionally, both are continuous functions of `theta1'. So, these
should not cause problems.

x5 is the normal cdf; hence it is always between 0 and 1.

So, the only way the log likelihood (i.e., `lnf') could blow up (to
-infinity) is if the expression to be logged comes very close to 0. But
if we start with initial values of 0 for all the parameters, we are
bounded away from 0 because of the first expression within the log
likelihood (which is (1/x3)*x4*x5). x3 is a finite number; x4 should be
the reciprocal of 'e' and x5 should be 0.5.

I don't see where the log likelihood could blow up. Any suggestions are
welcome. I give the re-worked program below.

___________________________________________________

clear
capture program drop gender1_lf
program gender1_lf
   version 8.1
   * set mem 100m
   #delimit ;
   args lnf theta1 theta2 theta3 
        theta5 theta6 theta7 theta8 theta9 
        theta10 theta11 theta12 theta13 theta14;
    #delimit cr


     
   quietly {
      scalar q1 = 1/(1 + exp(`theta5'))   
      scalar q2 = exp(`theta5')/(1 + exp(`theta5'))   

      scalar q3 = 1/(1 + exp(`theta6') + exp(`theta7'))   
      scalar q4 = exp(`theta6')/(1 + exp(`theta6') + exp(`theta7'))   
      scalar q5 = exp(`theta7')/(1 + exp(`theta6') + exp(`theta7'))   

      scalar q6 = 1/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))   
      scalar q7 = exp(`theta8')/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))   
      scalar q8 = exp(`theta9')/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))   
      scalar q9 = exp(`theta10')/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))   

      scalar q10 = 1/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))   
      scalar q11 = exp(`theta11')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))   
      scalar q12 = exp(`theta12')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))   
      scalar q13 = exp(`theta13')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))   
      scalar q14 = exp(`theta14')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))   
      
      scalar x1 = 2^($ML_y1)
      scalar x2 = exp(lnfact($ML_y1))
      scalar x3 = x1*x2     
      scalar x4 = ((exp(`theta1'))^($ML_y1)/exp(exp(`theta1')))
      scalar x5 = norm(-`theta3')
      scalar x6 =
((exp(`theta1'+`theta2'))^($ML_y1)/exp(exp(`theta1'+`theta2')))

   }
      #delimit ;
      quietly replace `lnf' = ln(((1/x3)*x4*x5) + 
                                 ((1-x5)*x6*( $ML_y2$ML_y3*q1
+ $ML_y4*q2 + 
                                 $ML_y5*q3 + $ML_y6*q4 + $ML_y7*q5 + 
                                 $ML_y8*q6 + $ML_y9*q7+ $ML_y10*q8 +
$ML_y11*q9 + 
                                 $ML_y12*q10 + $ML_y13*q11 + $ML_y14*q12
+ 
                                 $ML_y15*q13 + $ML_y16*q14)));
      #delimit cr
   
end


use india2.dta

#delimit ;
ml model lf gender1_lf (one: dfsize p21 p31 p32 p41 p42 p43 p51 p52 p53
p54 = v012 v025, nocons) 
(two: ) 
(three: p61 p62 p63 p64 p65 = v012 v130, nocons) 
(four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) (eleven:)
(twelve:) (thirteen:), tech(bhhh nr);  

#delimit cr






____________________________________________________ 

On Wed, 2006-06-14 at 08:10 +0100, Nick Cox wrote:
> The limit you are approaching may just be that your 
> model is not well suited to your data. Very simply, 
> what is the dimensionality of your parameter space
> and are your data adequate for what you are trying
> to do? 
> 
> Nevertheless it seems that you have only partially acted on 
> the earlier comments. Glancing at the code again
> I seem to see expressions which are logarithms of 
> products, logarithms of ratios, logarithms 
> of exponentials, etc. so there may be much more
> tbat you can do by way of cleaning up the expression. 
> 
> Also, it looks very much as if your thetas 
> are all single-valued parameters, in which 
> case scalars rather than variables are appropriate
> for what you are calling q1 ... 
> 
> Nick 
> n.j.cox@durham.ac.uk 
> 
> Deepankar Basu
>  
> > Nick,
> > 
> > Thanks for your comments. I accept your point about a positive
> > likelihood. Actually, there was an error in one of the 
> > statements; I had
> > put a "+" in place of a "*". so, now I have negative values of the log
> > likelihood. But, I still don't have convergence.  
> > 
> > About calculating the log likelihood directly: in my case, the log
> > likelihood for each observation is the log of a sum (of two 
> > quantities).
> > Hence, I cannot simply things much using the log function 
> > beyond taking
> > the log of the whole expression. That is what I have done 
> > now. So, now I
> > am working with the log likelihood directly.
> > 
> > Additionally, I have removed some of the temporary variables 
> > that I was
> > using to define the log likelihood and have instead used the 
> > expressions
> > directly. Since some of the temporary variables (like x3, x4 
> > or x6) were
> > exp functions, there was a potential for them to become too large.
> > Hence, I have removed them. The temporary variables that I 
> > use now (x1,
> > x2 and x5 apart from q*) are all bounded.
> > 
> > Now, when I issue the - ml max - command (after ml check 
> > shows that all
> > is well), I get the following error:
> > 
> > initial:       log likelihood = -2841.7221
> > rescale:       log likelihood = -1595.8146
> > rescale eq:    log likelihood = -1452.1748
> > (setting optimization to BHHH)
> > numerical derivatives are approximate
> > flat or discontinuous region encountered
> > numerical derivatives are approximate
> > flat or discontinuous region encountered
> > numerical derivatives are approximate
> > flat or discontinuous region encountered
> > could not calculate numerical derivatives
> > missing values encountered
> > r(430); 
> > 
> > Comments and suggestions are welcome. I give the re-worked do-file
> > below. Thanks,
> > Deepankar
> > 
> > ____________________________________________________________
> > 
> > clear
> > capture program drop gender1_lf
> > program gender1_lf
> >    version 8.1
> >    * set mem 100m
> >    #delimit ;
> >    args lnf theta1 theta2 theta3 
> >         theta5 theta6 theta7 theta8 theta9 
> >         theta10 theta11 theta12 theta13 theta14;
> > 
> >    tempvar q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14
> >            x1 x2 x5;
> >    #delimit cr
> > 
> >    quietly {
> >       gen double `q1' = 1/(1+exp(`theta5'))   
> >       gen double `q2' = exp(`theta5')/(1+exp(`theta5'))   
> > 
> >       gen double `q3' = 1/(1+exp(`theta6')+exp(`theta7'))   
> >       gen double `q4' = exp(`theta6')/(1
> > +exp(`theta6')+exp(`theta7'))   
> >       gen double `q5' = exp(`theta7')/(1
> > +exp(`theta6')+exp(`theta7'))   
> > 
> >       gen double `q6' = 1/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> >       gen double `q7' = exp(`theta8')/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> >       gen double `q8' = exp(`theta9')/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> >       gen double `q9' = exp(`theta10')/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> > 
> >       gen double `q10' = 1/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
> >       gen double `q11' = exp(`theta11')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
> >       gen double `q12' = exp(`theta12')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
> >       gen double `q13' = exp(`theta13')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
> >       gen double `q14' = exp(`theta14')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
> >       
> >       gen double `x1' = 2^(-$ML_y1)
> >       gen double `x2' = exp(lnfact($ML_y1))
> >       
> >       gen double `x5' = norm(-`theta3')
> >       
> > 
> >    }
> >       #delimit ;
> >       quietly replace `lnf' =
> > ln((`x1'/`x2')*(exp(-(exp(`theta1'))))*(exp($ML_y1*`theta1'))*`x5' + 
> > 
> > ((1-`x5')*(exp(-(exp(`theta2'))))*((exp($ML_y1*`theta2'))))*( $ML_y2 +
> > $ML_y3*`q1' + $ML_y4*`q2' + 
> >                                  $ML_y5*`q3' + $ML_y6*`q4'+ 
> > $ML_y7*`q5'
> > + 
> >                                  $ML_y8*`q6' + $ML_y9*`q7'+
> > $ML_y10*`q8'+$ML_y11*`q9' + 
> >                                  $ML_y12*`q10' + $ML_y13*`q11' +
> > $ML_y14*`q12'+ $ML_y15*`q13'+$ML_y16*`q14'));
> >       #delimit cr
> >    
> > end
> > 
> > 
> > use india2.dta
> > 
> > #delimit ;
> > ml model lf gender1_lf (one: dfsize p21 p31 p32 = v012 v025, nocons) 
> > (two: p41 p42 p43 p51 p52 p53 p54 = v012 v025) 
> > (three: p61 p62 p63 p64 p65 = v012 v130, nocons) 
> > (four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) (eleven:)
> > (twelve:) (thirteen:), tech(bhhh nr);  
> > 
> > #delimit cr
> > 
> > 
> > 
> > _____________________________________________________________________
> > 
> > 
> > 
> > 
> > On Tue, 2006-06-13 at 17:46 +0100, Nick Cox wrote:
> > > I am not an -ml- expert, but I sense some basic
> > > misunderstandings. 
> > > 
> > > A positive likelihood is not problematic as such. 
> > > Although a probability cannot exceed 1, 
> > > this is not true of a probability density. Whenever 
> > > probability densities are typically above 1, their product
> > > will be too. 
> > > 
> > > However, your program calculates the likelihood directly, 
> > > before logging it, and the numbers become far too big to handle. 
> > > exp(35000) is big!!! 
> > > 
> > > Thus it is the likelihood that becomes missing, not your
> > > original data. Stata would just ignore any data missings. 
> > > 
> > > You must rewrite your program so that you 
> > > are working in terms of the log likelihood throughout. 
> > > 
> > > A side-issue with your program is that you appear to 
> > > be putting lots of constants into variables. Using
> > > scalars is much better style and less wasteful
> > > of storage. 
> > > 
> > > Nick 
> > > n.j.cox@durham.ac.uk 
> > > 
> > > DEEPANKAR BASU
> > > > 
> > > > I am facing problems while trying to do a maximum likelihood 
> > > > estimation. I have written out the likelihood estimator; then 
> > > > I have used "ml check". The "ml check" shows that my 
> > > > likelihood evaluator has passed all tests. Then when I issue 
> > > > the "ml max" command, I get the following error message: 
> > > > 
> > > > initial:       log likelihood =  31058.016
> > > > rescale:       log likelihood =  31058.016
> > > > rescale eq:    log likelihood =  34639.752
> > > > (setting optimization to BHHH)
> > > > could not calculate numerical derivatives
> > > > missing values encountered
> > > > r(430);
> > > > 
> > > > First thing that worries me is that the log likelihood is 
> > > > positive; is that a sign that my program has some problem? 
> > > > And second, the maximisation is interrupted before completion 
> > > > because it could not calculate numerical derivatives (missing 
> > > > values encountered). I re-checked my dataset; there are no 
> > > > missing values for the variables that I am using. Probably 
> > > > the error message means something else.
> > > > 
> > > > I have tried (1) switching between techniques BHHH and NR; 
> > > > (2) ml max, difficult. I still get the same error message 
> > > > (though the log likelihood values where the execution of the 
> > > > program stops are different in each case).
> > > > 
> > > > 
> > > > Below I give the do-file that has my likelihood evaluator 
> > > > program and also a "ml model ..." command (as suggested in 
> > > > the book by Gould, Pitbaldo and Sribney: page 165); any 
> > > > suggestions are welcome. 
> > > > 
> > > > If someone can detect some error immediately, that will be 
> > > > great. But, if someone needs to run the program, I can send a 
> > > > small subset of my dataset for the purpose. Thanks in advance.
> > > > 
> > > > Deepankar Basu
> > > > 
> > > > ____________________________________________
> > > > 
> > > > 
> > > > clear
> > > > capture program drop gender_lf
> > > > program gender_lf
> > > >    version 8.1
> > > >    set mem 100m
> > > >    #delimit ;
> > > >    args lnf theta1 theta2 theta3 
> > > >         theta5 theta6 theta7 theta8 theta9 
> > > >         theta10 theta11 theta12 theta13 theta14;
> > > > 
> > > >    tempvar q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14
> > > >            x x1 x2 x3 x4 x5 x6 x7;
> > > >    #delimit cr
> > > > 
> > > >    quietly {
> > > >       gen double `q1' = 1/(1+exp(`theta5'))   
> > > >       gen double `q2' = exp(`theta5')/(1+exp(`theta5'))   
> > > > 
> > > >       gen double `q3' = 1/(1+exp(`theta6')+exp(`theta7'))   
> > > >       gen double `q4' = 
> > > > exp(`theta6')/(1+exp(`theta6')+exp(`theta7'))   
> > > >       gen double `q5' = 
> > > > exp(`theta7')/(1+exp(`theta6')+exp(`theta7'))   
> > > > 
> > > >       gen double `q6' = 
> > > > 1/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> > > >       gen double `q7' = 
> > > > exp(`theta8')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> > > >       gen double `q8' = 
> > > > exp(`theta9')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> > > >       gen double `q9' = 
> > > > exp(`theta10')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> > > > 
> > > >       gen double `q10' = 
> > > > 1/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')+ 
> > exp(`theta14'))   
> > > >       gen double `q11' = 
> > > > exp(`theta11')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))   
> > > >       gen double `q12' = 
> > > > exp(`theta12')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))   
> > > >       gen double `q13' = 
> > > > exp(`theta13')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))   
> > > >       gen double `q14' = 
> > > > exp(`theta14')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))   
> > > >       
> > > >       gen double `x1' = 2^(-$ML_y1)
> > > >       gen double `x2' = exp(lnfact($ML_y1))
> > > >       gen double `x3' = exp(`theta1')
> > > >       gen double `x4' = exp($ML_y1*`theta1')
> > > >       gen double `x5' = norm(-`theta3')
> > > >       gen double `x6' = exp(`theta2')
> > > >       gen double `x7' = exp($ML_y1*`theta2')
> > > >       #delimit ;
> > > >       gen double  `x' = `x1'*(1/`x2')*exp(-`x3')*`x4'*`x5' + 
> > > > (1-`x5')*exp(-`x6')*`x7' + 
> > > >                          $ML_y2 + ($ML_y3*`q1' + $ML_y4*`q2') + 
> > > >                          ($ML_y5*`q3' + $ML_y6*`q4'+ 
> > $ML_y7*`q5') + 
> > > >                          ($ML_y8*`q6' + $ML_y9*`q7'+ 
> > > > $ML_y10*`q8'+$ML_y11*`q9') + 
> > > >                          ($ML_y12*`q10' + $ML_y13*`q11' + 
> > > > $ML_y14*`q12'+ $ML_y15*`q13'+$ML_y16*`q14');
> > > >       #delimit cr
> > > > 
> > > >    }
> > > >    
> > > >    quietly replace `lnf' = ln(`x')
> > > > end
> > > > 
> > > > 
> > > > use india1.dta
> > > > 
> > > > #delimit ;
> > > > ml model lf gender_lf (one: dfsize p21 p31 p32 = v012 
> > v025, nocons) 
> > > > (two: p41 p42 p43 p51 p52 p53 p54 = v012 v025) 
> > > > (three: p61 p62 p63 p64 p65 = v012 v025 v130, nocons) 
> > > > (four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) 
> > > > (eleven:) (twelve:) (thirteen:), tech(bhhh nr);  
> > > > 
> > > > #delimit cr
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index