[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
Re: st: RE: Problems in maximising a likelihood function

From	Deepankar Basu <[email protected]>
To	[email protected]
Subject	Re: st: RE: Problems in maximising a likelihood function
Date	Tue, 13 Jun 2006 16:49:18 -0400
Nick,

Thanks for your comments. I accept your point about a positive
likelihood. Actually, there was an error in one of the statements; I had
put a "+" in place of a "*". so, now I have negative values of the log
likelihood. But, I still don't have convergence.  

About calculating the log likelihood directly: in my case, the log
likelihood for each observation is the log of a sum (of two quantities).
Hence, I cannot simply things much using the log function beyond taking
the log of the whole expression. That is what I have done now. So, now I
am working with the log likelihood directly.

Additionally, I have removed some of the temporary variables that I was
using to define the log likelihood and have instead used the expressions
directly. Since some of the temporary variables (like x3, x4 or x6) were
exp functions, there was a potential for them to become too large.
Hence, I have removed them. The temporary variables that I use now (x1,
x2 and x5 apart from q*) are all bounded.

Now, when I issue the - ml max - command (after ml check shows that all
is well), I get the following error:

initial:       log likelihood = -2841.7221
rescale:       log likelihood = -1595.8146
rescale eq:    log likelihood = -1452.1748
(setting optimization to BHHH)
numerical derivatives are approximate
flat or discontinuous region encountered
numerical derivatives are approximate
flat or discontinuous region encountered
numerical derivatives are approximate
flat or discontinuous region encountered
could not calculate numerical derivatives
missing values encountered
r(430); 

Comments and suggestions are welcome. I give the re-worked do-file
below. Thanks,
Deepankar

____________________________________________________________

clear
capture program drop gender1_lf
program gender1_lf
   version 8.1
   * set mem 100m
   #delimit ;
   args lnf theta1 theta2 theta3 
        theta5 theta6 theta7 theta8 theta9 
        theta10 theta11 theta12 theta13 theta14;

   tempvar q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14
           x1 x2 x5;
   #delimit cr

   quietly {
      gen double `q1' = 1/(1+exp(`theta5'))   
      gen double `q2' = exp(`theta5')/(1+exp(`theta5'))   

      gen double `q3' = 1/(1+exp(`theta6')+exp(`theta7'))   
      gen double `q4' = exp(`theta6')/(1
+exp(`theta6')+exp(`theta7'))   
      gen double `q5' = exp(`theta7')/(1
+exp(`theta6')+exp(`theta7'))   

      gen double `q6' = 1/(1
+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
      gen double `q7' = exp(`theta8')/(1
+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
      gen double `q8' = exp(`theta9')/(1
+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
      gen double `q9' = exp(`theta10')/(1
+exp(`theta8')+exp(`theta9')+exp(`theta10'))   

      gen double `q10' = 1/(1
+exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
      gen double `q11' = exp(`theta11')/(1
+exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
      gen double `q12' = exp(`theta12')/(1
+exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
      gen double `q13' = exp(`theta13')/(1
+exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
      gen double `q14' = exp(`theta14')/(1
+exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
      
      gen double `x1' = 2^(-$ML_y1)
      gen double `x2' = exp(lnfact($ML_y1))
      
      gen double `x5' = norm(-`theta3')
      

   }
      #delimit ;
      quietly replace `lnf' =
ln((`x1'/`x2')*(exp(-(exp(`theta1'))))*(exp($ML_y1*`theta1'))*`x5' + 

((1-`x5')*(exp(-(exp(`theta2'))))*((exp($ML_y1*`theta2'))))*( $ML_y2 +
$ML_y3*`q1' + $ML_y4*`q2' + 
                                 $ML_y5*`q3' + $ML_y6*`q4'+ $ML_y7*`q5'
+ 
                                 $ML_y8*`q6' + $ML_y9*`q7'+
$ML_y10*`q8'+$ML_y11*`q9' + 
                                 $ML_y12*`q10' + $ML_y13*`q11' +
$ML_y14*`q12'+ $ML_y15*`q13'+$ML_y16*`q14'));
      #delimit cr
   
end


use india2.dta

#delimit ;
ml model lf gender1_lf (one: dfsize p21 p31 p32 = v012 v025, nocons) 
(two: p41 p42 p43 p51 p52 p53 p54 = v012 v025) 
(three: p61 p62 p63 p64 p65 = v012 v130, nocons) 
(four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) (eleven:)
(twelve:) (thirteen:), tech(bhhh nr);  

#delimit cr



_____________________________________________________________________




On Tue, 2006-06-13 at 17:46 +0100, Nick Cox wrote:
> I am not an -ml- expert, but I sense some basic
> misunderstandings. 
> 
> A positive likelihood is not problematic as such. 
> Although a probability cannot exceed 1, 
> this is not true of a probability density. Whenever 
> probability densities are typically above 1, their product
> will be too. 
> 
> However, your program calculates the likelihood directly, 
> before logging it, and the numbers become far too big to handle. 
> exp(35000) is big!!! 
> 
> Thus it is the likelihood that becomes missing, not your
> original data. Stata would just ignore any data missings. 
> 
> You must rewrite your program so that you 
> are working in terms of the log likelihood throughout. 
> 
> A side-issue with your program is that you appear to 
> be putting lots of constants into variables. Using
> scalars is much better style and less wasteful
> of storage. 
> 
> Nick 
> [email protected] 
> 
> DEEPANKAR BASU
> > 
> > I am facing problems while trying to do a maximum likelihood 
> > estimation. I have written out the likelihood estimator; then 
> > I have used "ml check". The "ml check" shows that my 
> > likelihood evaluator has passed all tests. Then when I issue 
> > the "ml max" command, I get the following error message: 
> > 
> > initial:       log likelihood =  31058.016
> > rescale:       log likelihood =  31058.016
> > rescale eq:    log likelihood =  34639.752
> > (setting optimization to BHHH)
> > could not calculate numerical derivatives
> > missing values encountered
> > r(430);
> > 
> > First thing that worries me is that the log likelihood is 
> > positive; is that a sign that my program has some problem? 
> > And second, the maximisation is interrupted before completion 
> > because it could not calculate numerical derivatives (missing 
> > values encountered). I re-checked my dataset; there are no 
> > missing values for the variables that I am using. Probably 
> > the error message means something else.
> > 
> > I have tried (1) switching between techniques BHHH and NR; 
> > (2) ml max, difficult. I still get the same error message 
> > (though the log likelihood values where the execution of the 
> > program stops are different in each case).
> > 
> > 
> > Below I give the do-file that has my likelihood evaluator 
> > program and also a "ml model ..." command (as suggested in 
> > the book by Gould, Pitbaldo and Sribney: page 165); any 
> > suggestions are welcome. 
> > 
> > If someone can detect some error immediately, that will be 
> > great. But, if someone needs to run the program, I can send a 
> > small subset of my dataset for the purpose. Thanks in advance.
> > 
> > Deepankar Basu
> > 
> > ____________________________________________
> > 
> > 
> > clear
> > capture program drop gender_lf
> > program gender_lf
> >    version 8.1
> >    set mem 100m
> >    #delimit ;
> >    args lnf theta1 theta2 theta3 
> >         theta5 theta6 theta7 theta8 theta9 
> >         theta10 theta11 theta12 theta13 theta14;
> > 
> >    tempvar q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14
> >            x x1 x2 x3 x4 x5 x6 x7;
> >    #delimit cr
> > 
> >    quietly {
> >       gen double `q1' = 1/(1+exp(`theta5'))   
> >       gen double `q2' = exp(`theta5')/(1+exp(`theta5'))   
> > 
> >       gen double `q3' = 1/(1+exp(`theta6')+exp(`theta7'))   
> >       gen double `q4' = 
> > exp(`theta6')/(1+exp(`theta6')+exp(`theta7'))   
> >       gen double `q5' = 
> > exp(`theta7')/(1+exp(`theta6')+exp(`theta7'))   
> > 
> >       gen double `q6' = 
> > 1/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> >       gen double `q7' = 
> > exp(`theta8')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> >       gen double `q8' = 
> > exp(`theta9')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> >       gen double `q9' = 
> > exp(`theta10')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))   
> > 
> >       gen double `q10' = 
> > 1/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))   
> >       gen double `q11' = 
> > exp(`theta11')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > + exp(`theta14'))   
> >       gen double `q12' = 
> > exp(`theta12')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > + exp(`theta14'))   
> >       gen double `q13' = 
> > exp(`theta13')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > + exp(`theta14'))   
> >       gen double `q14' = 
> > exp(`theta14')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > + exp(`theta14'))   
> >       
> >       gen double `x1' = 2^(-$ML_y1)
> >       gen double `x2' = exp(lnfact($ML_y1))
> >       gen double `x3' = exp(`theta1')
> >       gen double `x4' = exp($ML_y1*`theta1')
> >       gen double `x5' = norm(-`theta3')
> >       gen double `x6' = exp(`theta2')
> >       gen double `x7' = exp($ML_y1*`theta2')
> >       #delimit ;
> >       gen double  `x' = `x1'*(1/`x2')*exp(-`x3')*`x4'*`x5' + 
> > (1-`x5')*exp(-`x6')*`x7' + 
> >                          $ML_y2 + ($ML_y3*`q1' + $ML_y4*`q2') + 
> >                          ($ML_y5*`q3' + $ML_y6*`q4'+ $ML_y7*`q5') + 
> >                          ($ML_y8*`q6' + $ML_y9*`q7'+ 
> > $ML_y10*`q8'+$ML_y11*`q9') + 
> >                          ($ML_y12*`q10' + $ML_y13*`q11' + 
> > $ML_y14*`q12'+ $ML_y15*`q13'+$ML_y16*`q14');
> >       #delimit cr
> > 
> >    }
> >    
> >    quietly replace `lnf' = ln(`x')
> > end
> > 
> > 
> > use india1.dta
> > 
> > #delimit ;
> > ml model lf gender_lf (one: dfsize p21 p31 p32 = v012 v025, nocons) 
> > (two: p41 p42 p43 p51 p52 p53 p54 = v012 v025) 
> > (three: p61 p62 p63 p64 p65 = v012 v025 v130, nocons) 
> > (four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) 
> > (eleven:) (twelve:) (thirteen:), tech(bhhh nr);  
> > 
> > #delimit cr
> > 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
References:
- st: RE: Problems in maximising a likelihood function
  - From: "Nick Cox" <[email protected]>
Prev by Date: [no subject]
Next by Date: Re: st: Obtaining Predicted Probabilities using gologit
Previous by thread: st: Re: panel data management: columns to rows
Next by thread: RE: st: RE: Problems in maximising a likelihood function
Index(es):
- Date
- Thread