# st: truncreg problem and the reasons

From: "Zou Hong"
Subject: st: truncreg problem and the reasons
Date: Sun, 22 Jul 2007 16:02:11 +0800

Dear listers,

I am investigating insurance consumption by firms using Cragg's (1971) model, which combines a first-stage probit with a second-stage regression truncated at zero (estimated using only the firms that buy insurance). My sample is fairly large: about 60,000 observations, with roughly 50% of firms buying insurance.
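To make the setup concrete, the two stages can be sketched in Stata roughly as follows. This is only an illustration: `buy` is a hypothetical indicator variable introduced here, and I assume both stages use the same regressors (size1 and tang).

```stata
* Sketch of the two stages of Cragg's (1971) model.
* "buy" is a hypothetical 0/1 indicator created for illustration.
generate byte buy = (ins1 > 0) if !missing(ins1)

* Stage 1: probit for the decision to buy insurance (all firms)
probit buy size1 tang

* Stage 2: regression truncated at zero; truncreg itself drops
* the observations with ins1 <= 0 when ll(0) is specified
truncreg ins1 size1 tang, ll(0)
```

The two stages are estimated separately, so the problem described here concerns only the second command.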

When I fit the simple truncated model, the estimation fails to converge; see the output below:

. truncreg ins1 size1 tang, ll(0)
(note: 34659 obs. truncated)

Fitting full model:

Iteration 0: log likelihood = 75318.141
Iteration 1: log likelihood = 77212.052
Iteration 2: log likelihood = 78769.215
Iteration 3: log likelihood = 79118.637 (backed up)
Iteration 4: log likelihood = 79289.237 (backed up)
Iteration 5: log likelihood = 79373.336 (backed up)
Iteration 6: log likelihood = 79415.173 (backed up)
Iteration 7: log likelihood = 79416.478 (backed up)
Iteration 8: log likelihood = 79417.13 (backed up)
Iteration 9: log likelihood = 79417.456 (backed up)
Iteration 10: log likelihood = 79417.619 (backed up)
Iteration 11: log likelihood = 79417.701 (backed up)
Iteration 12: log likelihood = 79417.741 (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 13: log likelihood = 79417.762 (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 14: log likelihood = 79417.772 (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 15: log likelihood = 79417.775 (backed up)
numerical derivatives are approximate
nearby values are missing
Iteration 16: log likelihood = 79417.777 (not concave)
numerical derivatives are approximate
nearby values are missing
Iteration 17: log likelihood = 79417.777
numerical derivatives are approximate
nearby values are missing
numerical derivatives are approximate
nearby values are missing
Iteration 18: log likelihood = 79417.779 (not concave)
numerical derivatives are approximate
nearby values are missing
numerical derivatives are approximate
nearby values are missing
Iteration 19: log likelihood = 79417.779 (not concave)
numerical derivatives are approximate
nearby values are missing
could not calculate numerical derivatives
missing values encountered
r(430);

I then found that "ins1", the dependent variable, defined as insurance expense/total assets, is highly skewed with high kurtosis (see the descriptive statistics below). I suspect this is the source of the problem. To mitigate the skewness, I created a variable "lnins1" (= ln(1 + ins1*1000)), which is also truncated at 0. I multiply ins1 by 1000 because ins1 is a very small ratio. I then re-estimated the truncated model with lnins1 as the dependent variable, and this time the model converged (see below).
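In Stata terms the transformation amounts to something like the following. This is a sketch: the `double` storage type and the two checks are illustrative additions, not necessarily the exact commands I ran.

```stata
* ins1 = insurance expense / total assets
* (storing as double is an assumption made here to limit rounding)
generate double lnins1 = ln(1 + 1000*ins1)

* Compare skewness and kurtosis of the raw and transformed variables
summarize ins1 lnins1, detail

* Spot check: transform the maximum of ins1 by hand and compare it
* with the maximum of lnins1 in the tabstat output
display ln(1 + 1000*.7857143)
```

Since ln(1) = 0, firms with ins1 = 0 also have lnins1 = 0, so the truncation point ll(0) is unchanged.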

. tabstat ins1 ins2 lnins* ,stats(mean median sd min max sk kurtosis n ) casewise

   stats |      ins1      ins2    lnins0    lnins1
---------+----------------------------------------
    mean |  .0029195  .0030478  1.730176  .6079098
     p50 |         0         0         0         0
      sd |  .0147884   .014874  2.216399  .9260294
     min |         0         0         0         0
     max |  .7857143  .7407407  12.05861  6.667865
skewness |  23.16256  21.60987  .8706521  1.712282
kurtosis |  796.7301  694.4615  2.549835  5.961717
       N |     60543     60543     60543     60543
--------------------------------------------------

. truncreg lnins1 size1 tang, ll(0)
(note: 34659 obs. truncated)

Fitting full model:

Iteration 0: log likelihood = -29082.919
Iteration 1: log likelihood = -27996.755
Iteration 2: log likelihood = -27960.392
Iteration 3: log likelihood = -27960.347
Iteration 4: log likelihood = -27960.347

Truncated regression
Limit:   lower =          0                 Number of obs  =    25884
         upper =       +inf                 Wald chi2(2)   =  6320.92
Log likelihood = -27960.347                 Prob > chi2    =    .0000

------------------------------------------------------------
      lnins1 |      Coef.   Std. Err.        z     P>|z|
-------------+----------------------------------------------
eq1          |
       size1 |      -.431    .006154    -70.04
        tang |  -1.843424   .0377302    -48.86
       _cons |   6.314259   .0626349    100.81
-------------+----------------------------------------------
sigma        |
       _cons |    .966854   .0065755    147.04     0.000
------------------------------------------------------------

I wonder whether the above transformation makes sense. I think it preserves the interpretation of the direction of the effect of the independent variables on "ins1".
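My reasoning, as a sketch rather than a formal result about the estimator, is that the transformation is strictly increasing and leaves zero in place. Writing y for ins1, g for the transformation, and x_k for a regressor (notation introduced here only for illustration):

$$
g(y) = \ln(1 + 1000\,y), \qquad g(0) = 0, \qquad g'(y) = \frac{1000}{1 + 1000\,y} > 0 \quad \text{for } y \ge 0 .
$$

Inverting gives $y = \left(e^{g} - 1\right)/1000$, so by the chain rule

$$
\frac{\partial y}{\partial x_k} = \frac{1 + 1000\,y}{1000}\,\frac{\partial g}{\partial x_k},
$$

and because the leading factor is positive, the two derivatives have the same sign. This speaks only to the direction of an effect: the coefficient magnitudes are on the lnins1 scale and are not directly comparable with those from a model for ins1.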

Any comments and suggestions are welcome.
Joe
