st: truncreg problem and the reasons
Dear listers,
I am investigating insurance consumption using Cragg's (1971) 
model, which combines a first-stage probit with a second-stage regression 
truncated at zero (using only firms that buy insurance). My sample is fairly 
large, roughly 60,000 observations, with about 50% of firms buying insurance.
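For readers who want to see the two parts side by side, here is a minimal sketch of the setup described above. The indicator name "buy" is hypothetical, and using the same regressors (size1, tang) in both parts is an assumption; only the truncreg line appears in this post.

```stata
* Sketch of the two parts of Cragg's (1971) model as described above.
* "buy" is a hypothetical indicator name; it is not in the original post.
generate byte buy = (ins1 > 0) if !missing(ins1)

* Part 1: probit for whether a firm buys insurance (all firms).
* Using size1 and tang here is an assumption.
probit buy size1 tang

* Part 2: truncated regression at zero, insurance buyers only
truncreg ins1 size1 tang, ll(0)
```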
In fitting the simple truncated model, the estimation fails to converge; see 
the message below:
. truncreg ins1 size1 tang, ll(0)
(note: 34659 obs. truncated)
Fitting full model:
Iteration 0:   log likelihood =  75318.141
Iteration 1:   log likelihood =  77212.052
Iteration 2:   log likelihood =  78769.215
Iteration 3:   log likelihood =  79118.637  (backed up)
Iteration 4:   log likelihood =  79289.237  (backed up)
Iteration 5:   log likelihood =  79373.336  (backed up)
Iteration 6:   log likelihood =  79415.173  (backed up)
Iteration 7:   log likelihood =  79416.478  (backed up)
Iteration 8:   log likelihood =   79417.13  (backed up)
Iteration 9:   log likelihood =  79417.456  (backed up)
Iteration 10:  log likelihood =  79417.619  (backed up)
Iteration 11:  log likelihood =  79417.701  (backed up)
Iteration 12:  log likelihood =  79417.741  (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 13:  log likelihood =  79417.762  (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 14:  log likelihood =  79417.772  (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 15:  log likelihood =  79417.775  (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 16:  log likelihood =  79417.777  (not concave)
numerical derivatives are approximate
nearby values are missing
Iteration 17:  log likelihood =  79417.777
numerical derivatives are approximate
nearby values are missing
numerical derivatives are approximate
nearby values are missing
Iteration 18:  log likelihood =  79417.779  (not concave)
numerical derivatives are approximate
nearby values are missing
numerical derivatives are approximate
nearby values are missing
Iteration 19:  log likelihood =  79417.779  (not concave)
numerical derivatives are approximate
nearby values are missing
could not calculate numerical derivatives
missing values encountered
r(430);
I then found that "ins1", the dependent variable, defined as insurance 
expense/total assets, is highly skewed with high kurtosis (see the descriptive 
statistics below). I suspect this is the source of the problem. To mitigate 
the skewness, I created a variable "lnins1" (= ln(1 + ins1*1000)), which is 
also truncated at 0. I multiply ins1 by 1000 because ins1 is a very small 
ratio. I then re-estimated the truncated model, and it did converge (see 
below).
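A minimal sketch of how such a variable could be generated, assuming lnins1 does not already exist in the dataset. Storing it as double is a choice made here, not something stated in the post.

```stata
* Log transform described above; equals 0 exactly when ins1 == 0
generate double lnins1 = ln(1 + 1000*ins1)

* Number of observations above the truncation point ll(0).
* This matches the estimation sample only if no regressor is missing.
count if lnins1 > 0 & !missing(lnins1)
```

Because ln(1) = 0, firms with no insurance expense stay at the truncation point, so ll(0) drops the same observations as before.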
. tabstat ins1 ins2 lnins*, stats(mean median sd min max sk kurtosis n) casewise
  stats |      ins1      ins2    lnins0    lnins1
---------+----------------------------------------
   mean |  .0029195  .0030478  1.730176  .6079098
    p50 |         0         0         0         0
     sd |  .0147884   .014874  2.216399  .9260294
    min |         0         0         0         0
    max |  .7857143  .7407407  12.05861  6.667865
skewness |  23.16256  21.60987  .8706521  1.712282
kurtosis |  796.7301  694.4615  2.549835  5.961717
      N |     60543     60543     60543     60543
--------------------------------------------------
. truncreg lnins1 size1 tang, ll(0)
(note: 34659 obs. truncated)
Fitting full model:
Iteration 0:   log likelihood = -29082.919
Iteration 1:   log likelihood = -27996.755
Iteration 2:   log likelihood = -27960.392
Iteration 3:   log likelihood = -27960.347
Iteration 4:   log likelihood = -27960.347
Truncated regression
Limit:   lower =          0                Number of obs =   25884
         upper =       +inf                Wald chi2(2)  = 6320.92
Log likelihood = -27960.347                Prob > chi2   =   .0000
---------------------------------------------------------
     lnins1 |      Coef.   Std. Err.      z    P>|z|
---------------------------------------------------------
eq1          |
      size1 |      -.431    .006154   -70.04
       tang |  -1.843424   .0377302   -48.86
      _cons |   6.314259   .0626349   100.81
---------------------------------------------------------
sigma        |
      _cons |    .966854   .0065755   147.04   0.000
------------------------------------------------------
I wonder whether the above transformation makes sense. I think it does 
preserve the direction of the effects of the independent variables on 
"ins1".
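One way to check the monotonicity behind that claim is sketched below. The name "ins1_back" is hypothetical, and the sketch assumes lnins1 was built as ln(1 + ins1*1000) as described above.

```stata
* Invert lnins1 = ln(1 + 1000*ins1); "ins1_back" is a hypothetical name
generate double ins1_back = (exp(lnins1) - 1)/1000

* The transform is strictly increasing, so the rank correlation
* should be 1 or very close to it (rounding can create a few ties)
spearman ins1 lnins1

* The two variables should agree up to rounding error
summarize ins1 ins1_back
```

This only illustrates that the transformation is monotone; it says nothing by itself about the size of the effects on "ins1".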
Any comments and suggestions are welcome.
Joe