st: Problem with marginal effects in zero truncated negative binomial
From: José Ignacio Antón <[email protected]>
To: <[email protected]>
Subject: st: Problem with marginal effects in zero truncated negative binomial
Date: Sun, 10 Aug 2008 12:58:19 +0200
Dear Stata users,
I am using Stata 10, fully updated.
I am analyzing several types of count data on health care use. With
-poisson- and -nbreg- I have no problems obtaining the coefficients and the
marginal effects with -mfx-.
The problem arises when I try to implement a two-part model. In the second
part, I have to fit a zero-truncated Poisson (-ztp-), or preferably a
zero-truncated negative binomial (-ztnb-), only for individuals with at
least one count (visitsGP>0).
I have two problems implementing the -ztnb-:
1) In some cases the likelihood function fails to converge (and, after
careful checking, this is not caused by collinearity).
2) When it does converge (although the message "not concave" appears at some
iterations, which according to the ML book by Gould et al. (2006) is not
problematic unless it appears at the final iteration), I get a strange
answer when I try to obtain the marginal effects using -mfx- (or -prchange-
from spost).
. xi: ztnb visitsGP gender age age2 chronic1 chronic2 accident smoker sport
>     i.immigrant2 couple working i.education i.income hhsize enviroment
>     doctors i.region i.town if visitsGP>0, robust
i.immigrant2      _Iimmigrant_0-2     (naturally coded; _Iimmigrant_0 omitted)
i.education       _Ieducation_1-4     (naturally coded; _Ieducation_1 omitted)
i.income          _Iincome_1-7        (naturally coded; _Iincome_1 omitted)
i.region          _Iregion_1-18       (naturally coded; _Iregion_1 omitted)
i.town_size       _Itown_size_1-7     (naturally coded; _Itown_size_1 omitted)
Fitting Zero-truncated poisson model:
Iteration 0: log pseudolikelihood = -8689.4121
Iteration 1: log pseudolikelihood = -7214.3645
Iteration 2: log pseudolikelihood = -7174.2588
Iteration 3: log pseudolikelihood = -7173.5565
Iteration 4: log pseudolikelihood = -7173.5563
Fitting constant-only model:
Iteration 0: log pseudolikelihood = -9340.1572
Iteration 1: log pseudolikelihood = -7954.1078 (not concave)
Iteration 2: log pseudolikelihood = -6943.3881
Iteration 3: log pseudolikelihood = -6862.141
Iteration 4: log pseudolikelihood = -6779.3102
Iteration 5: log pseudolikelihood = -6767.0463
Iteration 6: log pseudolikelihood = -6762.283
Iteration 7: log pseudolikelihood = -6760.9325
Iteration 8: log pseudolikelihood = -6760.6716
Iteration 9: log pseudolikelihood = -6760.6231
Iteration 10: log pseudolikelihood = -6760.6124
Iteration 11: log pseudolikelihood = -6760.6097
Iteration 12: log pseudolikelihood = -6760.6092
Iteration 13: log pseudolikelihood = -6760.6091
Iteration 14: log pseudolikelihood = -6760.6091
Fitting full model:
Iteration 0: log pseudolikelihood = -6478.5096
Iteration 1: log pseudolikelihood = -6451.0357
Iteration 2: log pseudolikelihood = -6451.0008
Iteration 3: log pseudolikelihood = -6451.0008
Zero-truncated negative binomial regression       Number of obs   =       8793
Dispersion     = mean                             Wald chi2(44)   =          .
Log likelihood = -6451.0008                       Prob > chi2     =          .
------------------------------------------------------------------------------
             |               Robust
    visitsGP |      Coef.  Std. Err.        z   P>|z|    [95% Conf.  Interval]
-------------+----------------------------------------------------------------
      gender |   .1320236   .0693939     1.90   0.057     -.003986    .2680331
         age |  -.0010105   .0108056    -0.09   0.925    -.0221892    .0201681
        age2 |   .0024984    .009758     0.26   0.798    -.0166269    .0216237
    chronic1 |   .2751715   .1460744     1.88   0.060    -.0111291    .5614721
    chronic2 |   1.261197   .1477716     8.53   0.000     .9715704    1.550824
    accident |   .1831772   .0840681     2.18   0.029     .0184068    .3479477
      smoker |  -.0965631   .0797977    -1.21   0.226    -.2529636    .0598374
       sport |  -.1940175   .0669138    -2.90   0.004    -.3251662   -.0628688
_Iimmigran~1 |  -.2546745   .2455856    -1.04   0.300    -.7360135    .2266644
_Iimmigran~2 |    .070254   .1343127     0.52   0.601    -.1929941    .3335021
      couple |  -.0089753   .0745949    -0.12   0.904    -.1551787     .137228
     working |  -.0011818    .091396    -0.01   0.990    -.1803146    .1779511
_Ieducatio~2 |    .093249    .091574     1.02   0.309    -.0862329    .2727308
_Ieducatio~3 |   .1589886   .1153917     1.38   0.168     -.067175    .3851523
_Ieducatio~4 |   .0681036   .1149247     0.59   0.553    -.1571447    .2933519
  _Iincome_3 |   .1822611   .1046566     1.74   0.082     -.022862    .3873842
  _Iincome_4 |  -.0165433    .103265    -0.16   0.873    -.2189391    .1858524
  _Iincome_5 |   -.064406   .1171721    -0.55   0.583    -.2940591    .1652472
  _Iincome_6 |   .0363746    .131031     0.28   0.781    -.2204414    .2931905
  _Iincome_7 |  -.2377055   .2239894    -1.06   0.289    -.6767166    .2013055
      hhsize |   .0183502   .0306031     0.60   0.549    -.0416308    .0783312
  enviroment |   .0548374   .0224128     2.45   0.014     .0109092    .0987657
     doctors |   .0322389   .0807495     0.40   0.690    -.1260271     .190505
  _Iregion_2 |   .4349085   .1725509     2.52   0.012      .096715     .773102
  _Iregion_3 |  -.5504036   .2498966    -2.20   0.028    -1.040192   -.0606152
  _Iregion_4 |  -.1382842   .1676026    -0.83   0.409    -.4667794    .1902109
  _Iregion_5 |  -.3033533   .1714646    -1.77   0.077    -.6394178    .0327112
  _Iregion_6 |  -.0138227   .2211614    -0.06   0.950     -.447291    .4196457
  _Iregion_7 |  -.2185485    .207345    -1.05   0.292    -.6249373    .1878403
  _Iregion_8 |  -.0180477   .1786581    -0.10   0.920    -.3682111    .3321157
  _Iregion_9 |   .1038768   .1960173     0.53   0.596      -.28031    .4880636
 _Iregion_10 |   .3059329   .1644643     1.86   0.063    -.0164111     .628277
 _Iregion_11 |   .6352331   .1794982     3.54   0.000     .2834232     .987043
 _Iregion_12 |   .2496245   .1486112     1.68   0.093    -.0416481    .5408972
 _Iregion_13 |  -.0800528   .2176679    -0.37   0.713    -.5066741    .3465685
 _Iregion_14 |   .0798718   .1437676     0.56   0.579    -.2019076    .3616511
 _Iregion_15 |  -.0130019   .2603814    -0.05   0.960    -.5233402    .4973363
 _Iregion_16 |  -.3708999   .2419471    -1.53   0.125    -.8451076    .1033077
 _Iregion_17 |   .0353583   .2251124     0.16   0.875    -.4058539    .4765705
 _Iregion_18 |  -.4824453   .2837713    -1.70   0.089    -1.038627    .0737363
_Itown_siz~2 |  -.1070012   .1373188    -0.78   0.436    -.3761412    .1621388
_Itown_siz~3 |   -.170247   .1302397    -1.31   0.191     -.425512     .085018
_Itown_siz~4 |  -.3958154   .1658262    -2.39   0.017    -.7208289    -.070802
_Itown_siz~5 |  -.1711893    .134986    -1.27   0.205    -.4357569    .0933783
_Itown_siz~6 |  -.4062654    .181368    -2.24   0.025    -.7617402   -.0507906
_Itown_siz~7 |  -.0472928   .2159213    -0.22   0.827    -.4704907    .3759051
       _cons |  -18.87778   .4562823   -41.37   0.000    -19.77208   -17.98348
-------------+----------------------------------------------------------------
    /lnalpha |   17.70275          .                             .           .
-------------+----------------------------------------------------------------
       alpha |   4.88e+07          .                             .           .
------------------------------------------------------------------------------
The marginal effects are then extremely small (this happens for all
coefficients; I report only some of them):
Marginal effects after ztnb
      y  = predicted number of events (predict)
         =  1.403e-08
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  gender*|   1.82e-09      .00000    1.99   0.047   2.6e-11  3.6e-09    .66098
   sport*|  -2.78e-09      .00000   -2.95   0.003  -4.6e-09 -9.3e-10   .599909
_Iimmi~1*|  -3.16e-09      .00000   -1.20   0.232  -8.3e-09  2.0e-09   .010804
_Iimmi~2*|   1.02e-09      .00000    0.51   0.613  -2.9e-09  5.0e-09   .048448
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
The same happens if I use -prchange- from the -spost- package:
. prchange
ztnb: Changes in Unconditional Rate for visitsGP
min->max 0->1 -+1/2 -+sd/2
gender 0.0000 0.0000 0.0000 0.0000
age -0.0000 -0.0000 -0.0000 -0.0000
age2 0.0000 0.0000 0.0000 0.0000
chronic1 0.0000 0.0000 0.0000 0.0000
chronic2 0.0000 0.0000 0.0000 0.0000
accident 0.0000 0.0000 0.0000 0.0000
smoker -0.0000 -0.0000 -0.0000 -0.0000
sport -0.0000 -0.0000 -0.0000 -0.0000
_Iimmigran~1 -0.0000 -0.0000 -0.0000 -0.0000
_Iimmigran~2 0.0000 0.0000 0.0000 0.0000
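A possible reading of these numbers, offered as a sketch under assumptions rather than a confirmed diagnosis: both outputs describe the unconditional rate (-mfx- labels it "predicted number of events", -prchange- labels it "Unconditional Rate"), not the mean given at least one visit. Assuming that this prediction is the mean mu = exp(xb) of the untruncated negative binomial, and that "Dispersion = mean" denotes the NB2 form with P(y=0) = (1 + alpha*mu)^(-1/alpha), the implied conditional mean E[y | y>0] can be computed from the two reported values. The function name below is illustrative only.

```python
import math

def ztnb_conditional_mean(mu, alpha):
    """E[y | y > 0] for an NB2 count with untruncated mean mu and dispersion alpha.

    Assumes P(y = 0) = (1 + alpha*mu)**(-1/alpha), the NB2 parameterization.
    """
    # 1 - P(y = 0), computed with log1p/expm1 because both terms are tiny here
    p_positive = -math.expm1(-math.log1p(alpha * mu) / alpha)
    return mu / p_positive

mu = 1.403e-08    # "predicted number of events" reported by -mfx- above
alpha = 4.88e+07  # alpha reported in the -ztnb- output above

cm = ztnb_conditional_mean(mu, alpha)
print(round(cm, 2))
```

Under these assumptions the conditional mean comes out at roughly 1.3 visits, a plausible value for data in which most positive counts equal 1, even though the unconditional rate is essentially zero. If the installed -ztnb- offers a conditional-mean prediction (worth checking in -help ztnb postestimation-), requesting it through the predict() option of -mfx- would give marginal effects on that scale instead.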
So far I have not found a satisfactory answer to my question. The only
explanation I can think of is that most observations equal 1, so there is
very little variability. The distribution is roughly as follows:
Monthly |
visits to |
GP | Freq. Percent Cum.
------------+-----------------------------------
1 | 7,065 79.40 79.40
2 | 1,150 12.92 92.32
3 | 310 3.48 95.81
4 | 278 3.12 98.93
5 | 36 0.40 99.34
6 | 13 0.15 99.48
7 | 8 0.09 99.57
8 | 13 0.15 99.72
9 | 4 0.04 99.76
10 | 7 0.08 99.84
12 | 4 0.04 99.89
14 | 2 0.02 99.91
15 | 3 0.03 99.94
20 | 5 0.06 100.00
------------+-----------------------------------
Total | 8,898 100.00
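To put a number on "very little variability", the sample moments can be recovered directly from the tabulation above. This is only a descriptive sketch; the frequencies are copied from the table and nothing else is assumed.

```python
# Frequencies copied from the tabulation of monthly GP visits above
freq = {1: 7065, 2: 1150, 3: 310, 4: 278, 5: 36, 6: 13, 7: 8, 8: 13,
        9: 4, 10: 7, 12: 4, 14: 2, 15: 3, 20: 5}

n = sum(freq.values())                                  # number of observations
mean = sum(k * f for k, f in freq.items()) / n          # sample mean of positive counts
var = sum(f * (k - mean) ** 2 for k, f in freq.items()) / (n - 1)  # sample variance
share_ones = freq[1] / n                                # fraction of observations equal to 1

print(n, round(mean, 3), round(var, 3), round(share_ones, 3))
```

The mean of the positive counts is about 1.37 with a variance of about 1.08, and 79.4% of the observations equal 1.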
If I use the Poisson distribution (-ztp-), the results look more sensible,
perhaps because the Poisson model is more parsimonious.
Marginal effects after ztp
      y  = predicted number of events (predict)
         =  .56415269
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  gender*|   .0239736      .03491    0.69   0.492   -.04444  .092388    .66098
   sport*|   -.106718      .03247   -3.29   0.001  -.170357 -.043079   .599909
_Iimmi~1*|  -.1111744      .09691   -1.15   0.251  -.301124  .078775   .010804
_Iimmi~2*|   .0154857      .06213    0.25   0.803  -.106287  .137259   .048448
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
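Note that the predicted value of .564 is below 1 even though every observation in the estimation sample has at least one visit. One way to reconcile this, assuming the -ztp- prediction shown is the rate lambda = exp(xb) of the untruncated Poisson, is to convert it to the mean conditional on a positive count, E[y | y>0] = lambda / (1 - exp(-lambda)). The function name is illustrative only.

```python
import math

def ztp_conditional_mean(lam):
    """E[y | y > 0] for a Poisson count with untruncated rate lam."""
    # 1 - P(y = 0) = 1 - exp(-lam), written with expm1 for accuracy at small lam
    return lam / -math.expm1(-lam)

lam = 0.56415269  # "predicted number of events" reported by -mfx- after -ztp-
cm = ztp_conditional_mean(lam)
print(round(cm, 2))
```

Under that assumption the implied conditional mean is roughly 1.3, so the marginal effects reported above would refer to the untruncated rate rather than to the expected number of visits among users.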
And when I use count data with much more variability, I have no problems
with convergence or marginal effects; for example, data distributed as
follows:
visitshospital Freq. Percent Cum.
1 459 18.53 18.53
2 307 12.39 30.92
3 252 10.17 41.10
4 153 6.18 47.27
5 139 5.61 52.89
6 70 2.83 55.71
7 126 5.09 60.80
8 143 5.77 66.57
9 69 2.79 69.36
10 111 4.48 73.84
11 48 1.94 75.78
12 49 1.98 77.76
13 18 0.73 78.48
14 64 2.58 81.07
15 96 3.88 84.94
16 26 1.05 85.99
17 33 1.33 87.32
18 17 0.69 88.01
19 14 0.57 88.57
20 34 1.37 89.95
21 16 0.65 90.59
22 34 1.37 91.97
23 16 0.65 92.61
24 10 0.40 93.02
25 7 0.28 93.30
26 8 0.32 93.62
27 7 0.28 93.90
28 8 0.32 94.23
29 9 0.36 94.59
30 33 1.33 95.92
31 1 0.04 95.96
32 9 0.36 96.33
33 1 0.04 96.37
34 4 0.16 96.53
35 4 0.16 96.69
36 4 0.16 96.85
37 10 0.40 97.25
38 2 0.08 97.34
39 1 0.04 97.38
40 2 0.08 97.46
41 2 0.08 97.54
42 2 0.08 97.62
43 3 0.12 97.74
44 3 0.12 97.86
45 4 0.16 98.02
47 3 0.12 98.14
50 3 0.12 98.26
51 1 0.04 98.30
52 5 0.20 98.51
54 3 0.12 98.63
55 1 0.04 98.67
56 1 0.04 98.71
57 1 0.04 98.75
58 2 0.08 98.83
60 6 0.24 99.07
65 1 0.04 99.11
66 1 0.04 99.15
67 2 0.08 99.23
70 1 0.04 99.27
75 1 0.04 99.31
76 1 0.04 99.35
77 1 0.04 99.39
83 1 0.04 99.43
84 1 0.04 99.48
90 5 0.20 99.68
93 1 0.04 99.72
111 2 0.08 99.80
150 1 0.04 99.84
180 2 0.08 99.92
213 1 0.04 99.96
232 1 0.04 100.00
Total 2,477 100.00
Does anyone know why this happens? Maybe it is because most observations
equal 1 and I am using -ztnb-, which is less parsimonious and harder to
estimate. Should I simply use a negative binomial regression on all the
data, or estimate a logit or probit only for the probability of having one
count?
Thank you very much,
José Ignacio Antón
Department of Economics
University of Salamanca