st: Problem with marginal effects in zero truncated negative binomial

From	José Ignacio Antón <[email protected]>
To	<[email protected]>
Subject	st: Problem with marginal effects in zero truncated negative binomial
Date	Sun, 10 Aug 2008 12:58:19 +0200

Dear Stata users,

I am using Stata 10, fully updated.

I am analyzing several types of health care count data. With -poisson- and
-nbreg- I have no problem obtaining the coefficients and the marginal
effects with -mfx-.

The problem arises when I try to implement a two-part model. In the second
part, I have to fit a zero-truncated Poisson model (-ztp-), or preferably a
zero-truncated negative binomial (-ztnb-), only for individuals with at
least one visit (visitsGP>0).

I have two problems implementing -ztnb-:
1) In some cases the likelihood maximization fails to converge (after
careful checking, this is not caused by collinearity).
2) When it does converge, the marginal effects I get from -mfx- (or from
-prchange- in SPost) look strange. Some iterations report "not concave",
but according to the book on maximum likelihood by Gould et al. (2006) this
is not a problem unless it appears in the final iterations.

. xi: ztnb visitsGP gender age age2 chronic1 chronic2 accident smoker sport i.immigrant2 couple working i.education i.income hhsize enviroment doctors i.region i.town if visitsGP>0, robust
i.immigrant2      _Iimmigrant_0-2     (naturally coded; _Iimmigrant_0 omitted)
i.education       _Ieducation_1-4     (naturally coded; _Ieducation_1 omitted)
i.income          _Iincome_1-7        (naturally coded; _Iincome_1 omitted)
i.region          _Iregion_1-18       (naturally coded; _Iregion_1 omitted)
i.town_size       _Itown_size_1-7     (naturally coded; _Itown_size_1 omitted)

Fitting Zero-truncated poisson model:

Iteration 0:   log pseudolikelihood = -8689.4121  
Iteration 1:   log pseudolikelihood = -7214.3645  
Iteration 2:   log pseudolikelihood = -7174.2588  
Iteration 3:   log pseudolikelihood = -7173.5565  
Iteration 4:   log pseudolikelihood = -7173.5563  

Fitting constant-only model:

Iteration 0:   log pseudolikelihood = -9340.1572  
Iteration 1:   log pseudolikelihood = -7954.1078  (not concave)
Iteration 2:   log pseudolikelihood = -6943.3881  
Iteration 3:   log pseudolikelihood =  -6862.141  
Iteration 4:   log pseudolikelihood = -6779.3102  
Iteration 5:   log pseudolikelihood = -6767.0463  
Iteration 6:   log pseudolikelihood =  -6762.283  
Iteration 7:   log pseudolikelihood = -6760.9325  
Iteration 8:   log pseudolikelihood = -6760.6716  
Iteration 9:   log pseudolikelihood = -6760.6231  
Iteration 10:  log pseudolikelihood = -6760.6124  
Iteration 11:  log pseudolikelihood = -6760.6097  
Iteration 12:  log pseudolikelihood = -6760.6092  
Iteration 13:  log pseudolikelihood = -6760.6091  
Iteration 14:  log pseudolikelihood = -6760.6091  

Fitting full model:

Iteration 0:   log pseudolikelihood = -6478.5096  
Iteration 1:   log pseudolikelihood = -6451.0357  
Iteration 2:   log pseudolikelihood = -6451.0008  
Iteration 3:   log pseudolikelihood = -6451.0008  

Zero-truncated negative binomial regression       Number of obs   =       8793
Dispersion     = mean                             Wald chi2(44)   =          .
Log likelihood = -6451.0008                       Prob > chi2     =          .

------------------------------------------------------------------------------
             |               Robust
    visitsGP |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   .1320236   .0693939     1.90   0.057     -.003986    .2680331
         age |  -.0010105   .0108056    -0.09   0.925    -.0221892    .0201681
        age2 |   .0024984    .009758     0.26   0.798    -.0166269    .0216237
    chronic1 |   .2751715   .1460744     1.88   0.060    -.0111291    .5614721
    chronic2 |   1.261197   .1477716     8.53   0.000     .9715704    1.550824
    accident |   .1831772   .0840681     2.18   0.029     .0184068    .3479477
      smoker |  -.0965631   .0797977    -1.21   0.226    -.2529636    .0598374
       sport |  -.1940175   .0669138    -2.90   0.004    -.3251662   -.0628688
_Iimmigran~1 |  -.2546745   .2455856    -1.04   0.300    -.7360135    .2266644
_Iimmigran~2 |    .070254   .1343127     0.52   0.601    -.1929941    .3335021
      couple |  -.0089753   .0745949    -0.12   0.904    -.1551787     .137228
     working |  -.0011818    .091396    -0.01   0.990    -.1803146    .1779511
_Ieducatio~2 |    .093249    .091574     1.02   0.309    -.0862329    .2727308
_Ieducatio~3 |   .1589886   .1153917     1.38   0.168     -.067175    .3851523
_Ieducatio~4 |   .0681036   .1149247     0.59   0.553    -.1571447    .2933519
  _Iincome_3 |   .1822611   .1046566     1.74   0.082     -.022862    .3873842
  _Iincome_4 |  -.0165433    .103265    -0.16   0.873    -.2189391    .1858524
  _Iincome_5 |   -.064406   .1171721    -0.55   0.583    -.2940591    .1652472
  _Iincome_6 |   .0363746    .131031     0.28   0.781    -.2204414    .2931905
  _Iincome_7 |  -.2377055   .2239894    -1.06   0.289    -.6767166    .2013055
      hhsize |   .0183502   .0306031     0.60   0.549    -.0416308    .0783312
  enviroment |   .0548374   .0224128     2.45   0.014     .0109092    .0987657
     doctors |   .0322389   .0807495     0.40   0.690    -.1260271     .190505
  _Iregion_2 |   .4349085   .1725509     2.52   0.012      .096715     .773102
  _Iregion_3 |  -.5504036   .2498966    -2.20   0.028    -1.040192   -.0606152
  _Iregion_4 |  -.1382842   .1676026    -0.83   0.409    -.4667794    .1902109
  _Iregion_5 |  -.3033533   .1714646    -1.77   0.077    -.6394178    .0327112
  _Iregion_6 |  -.0138227   .2211614    -0.06   0.950     -.447291    .4196457
  _Iregion_7 |  -.2185485    .207345    -1.05   0.292    -.6249373    .1878403
  _Iregion_8 |  -.0180477   .1786581    -0.10   0.920    -.3682111    .3321157
  _Iregion_9 |   .1038768   .1960173     0.53   0.596      -.28031    .4880636
 _Iregion_10 |   .3059329   .1644643     1.86   0.063    -.0164111     .628277
 _Iregion_11 |   .6352331   .1794982     3.54   0.000     .2834232     .987043
 _Iregion_12 |   .2496245   .1486112     1.68   0.093    -.0416481    .5408972
 _Iregion_13 |  -.0800528   .2176679    -0.37   0.713    -.5066741    .3465685
 _Iregion_14 |   .0798718   .1437676     0.56   0.579    -.2019076    .3616511
 _Iregion_15 |  -.0130019   .2603814    -0.05   0.960    -.5233402    .4973363
 _Iregion_16 |  -.3708999   .2419471    -1.53   0.125    -.8451076    .1033077
 _Iregion_17 |   .0353583   .2251124     0.16   0.875    -.4058539    .4765705
 _Iregion_18 |  -.4824453   .2837713    -1.70   0.089    -1.038627    .0737363
_Itown_siz~2 |  -.1070012   .1373188    -0.78   0.436    -.3761412    .1621388
_Itown_siz~3 |   -.170247   .1302397    -1.31   0.191     -.425512     .085018
_Itown_siz~4 |  -.3958154   .1658262    -2.39   0.017    -.7208289    -.070802
_Itown_siz~5 |  -.1711893    .134986    -1.27   0.205    -.4357569    .0933783
_Itown_siz~6 |  -.4062654    .181368    -2.24   0.025    -.7617402   -.0507906
_Itown_siz~7 |  -.0472928   .2159213    -0.22   0.827    -.4704907    .3759051
       _cons |  -18.87778   .4562823   -41.37   0.000    -19.77208   -17.98348
-------------+----------------------------------------------------------------
    /lnalpha |   17.70275          .                             .           .
-------------+----------------------------------------------------------------
       alpha |   4.88e+07          .                             .           .
------------------------------------------------------------------------------

The marginal effects are then extremely small (this happens for all
coefficients; I report only some of them):

Marginal effects after ztnb
      y  = predicted number of events (predict)
         =  1.403e-08
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  gender*|   1.82e-09      .00000    1.99   0.047   2.6e-11  3.6e-09   .66098
   sport*|  -2.78e-09      .00000   -2.95   0.003  -4.6e-09 -9.3e-10  .599909
_Iimmi~1*|  -3.16e-09      .00000   -1.20   0.232  -8.3e-09  2.0e-09  .010804
_Iimmi~2*|   1.02e-09      .00000    0.51   0.613  -2.9e-09  5.0e-09  .048448
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
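
A rough cross-check of these figures, written as a small Python sketch rather than Stata, and offered as a possible reading rather than a confirmed diagnosis. It assumes that the "predicted number of events" reported by -mfx- is the mean of the untruncated distribution, mu = exp(xb). Under a negative binomial with mean dispersion, the mean conditional on a positive count is mu / (1 - (1 + alpha*mu)^(-1/alpha)). The names mu, alpha, p_positive and cond_mean are illustrative; the two input numbers are copied from the output above.

```python
import math

# Values copied from the -ztnb- and -mfx- output above.
mu = 1.403e-08      # "predicted number of events" reported by -mfx- (assumed to be exp(xb))
alpha = 4.88e07     # estimated dispersion parameter

# Pr(y > 0) under a negative binomial with mean mu and dispersion alpha:
# 1 - (1 + alpha*mu)^(-1/alpha). log1p/expm1 are used because this
# probability is of order 1e-8 and naive arithmetic would lose precision.
p_positive = -math.expm1(-math.log1p(alpha * mu) / alpha)

# Mean of the count conditional on being positive.
cond_mean = mu / p_positive

print(f"alpha * mu   = {alpha * mu:.4f}")
print(f"Pr(y > 0)    = {p_positive:.3e}")
print(f"E[y | y > 0] = {cond_mean:.4f}")
```

If the assumption holds, the fitted model implies a conditional mean of about 1.31 visits, so the tiny values above may reflect the scale on which the prediction is computed rather than absent effects.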

The same happens if I use -prchange- from the SPost package:

. prchange

ztnb: Changes in Unconditional Rate for visitsGP

              min->max      0->1     -+1/2    -+sd/2
      gender    0.0000    0.0000    0.0000    0.0000
         age   -0.0000   -0.0000   -0.0000   -0.0000
        age2    0.0000    0.0000    0.0000    0.0000
    chronic1    0.0000    0.0000    0.0000    0.0000
    chronic2    0.0000    0.0000    0.0000    0.0000
    accident    0.0000    0.0000    0.0000    0.0000
      smoker   -0.0000   -0.0000   -0.0000   -0.0000
       sport   -0.0000   -0.0000   -0.0000   -0.0000
_Iimmigran~1   -0.0000   -0.0000   -0.0000   -0.0000
_Iimmigran~2    0.0000    0.0000    0.0000    0.0000



So far I have not found a satisfactory explanation. The only thing I can
think of is that most observations equal 1 and there is very little
variability. The distribution is roughly as follows:


    Monthly |
  visits to |
         GP |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      7,065       79.40       79.40
          2 |      1,150       12.92       92.32
          3 |        310        3.48       95.81
          4 |        278        3.12       98.93
          5 |         36        0.40       99.34
          6 |         13        0.15       99.48
          7 |          8        0.09       99.57
          8 |         13        0.15       99.72
          9 |          4        0.04       99.76
         10 |          7        0.08       99.84
         12 |          4        0.04       99.89
         14 |          2        0.02       99.91
         15 |          3        0.03       99.94
         20 |          5        0.06      100.00
------------+-----------------------------------
      Total |      8,898      100.00
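
To put a number on "very little variability", the tabulation above can be summarized directly. This is a small Python sketch, not Stata output; the dictionary freq simply retypes the table, and the other names are illustrative.

```python
# Frequencies retyped from the tabulation of monthly GP visits above.
freq = {1: 7065, 2: 1150, 3: 310, 4: 278, 5: 36, 6: 13, 7: 8, 8: 13,
        9: 4, 10: 7, 12: 4, 14: 2, 15: 3, 20: 5}

n = sum(freq.values())                           # number of observations
mean = sum(y * f for y, f in freq.items()) / n   # sample mean of the counts
# Sample variance (n - 1 in the denominator).
var = sum(f * (y - mean) ** 2 for y, f in freq.items()) / (n - 1)
share_ones = freq[1] / n                         # share of observations equal to 1

print(f"n = {n}, mean = {mean:.3f}, variance = {var:.3f}, "
      f"share of ones = {share_ones:.3f}")
```

The sample mean is about 1.37 and the sample variance about 1.08, with roughly 79% of the observations equal to 1.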


If I use the Poisson distribution (-ztp-), the results look more sensible,
perhaps because the Poisson model is more parsimonious.

Marginal effects after ztp
      y  = predicted number of events (predict)
         =  .56415269
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  gender*|   .0239736      .03491    0.69   0.492   -.04444  .092388   .66098
   sport*|   -.106718      .03247   -3.29   0.001  -.170357 -.043079  .599909
_Iimmi~1*|  -.1111744      .09691   -1.15   0.251  -.301124  .078775  .010804
_Iimmi~2*|   .0154857      .06213    0.25   0.803  -.106287  .137259  .048448
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
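
A rough cross-check for the Poisson case, as a short Python sketch. It assumes that the reported "predicted number of events" is the untruncated mean mu = exp(xb); for a zero-truncated Poisson the mean conditional on a positive count is then mu / (1 - exp(-mu)). The variable names are illustrative.

```python
import math

mu = 0.56415269                 # "predicted number of events" reported by -mfx- after -ztp-
p_positive = -math.expm1(-mu)   # Pr(y > 0) = 1 - exp(-mu) under a Poisson with mean mu
cond_mean = mu / p_positive     # E[y | y > 0]

print(f"Pr(y > 0) = {p_positive:.4f}, E[y | y > 0] = {cond_mean:.4f}")
```

Under that assumption the implied conditional mean is roughly 1.31 visits.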

When I use count data with much more variability, I have no problems with
convergence or marginal effects; for example, with data distributed as
follows:

visitshospital	Freq.	Percent	Cum.
1	459	18.53	18.53
2	307	12.39	30.92
3	252	10.17	41.10
4	153	6.18	47.27
5	139	5.61	52.89
6	70	2.83	55.71
7	126	5.09	60.80
8	143	5.77	66.57
9	69	2.79	69.36
10	111	4.48	73.84
11	48	1.94	75.78
12	49	1.98	77.76
13	18	0.73	78.48
14	64	2.58	81.07
15	96	3.88	84.94
16	26	1.05	85.99
17	33	1.33	87.32
18	17	0.69	88.01
19	14	0.57	88.57
20	34	1.37	89.95
21	16	0.65	90.59
22	34	1.37	91.97
23	16	0.65	92.61
24	10	0.40	93.02
25	7	0.28	93.30
26	8	0.32	93.62
27	7	0.28	93.90
28	8	0.32	94.23
29	9	0.36	94.59
30	33	1.33	95.92
31	1	0.04	95.96
32	9	0.36	96.33
33	1	0.04	96.37
34	4	0.16	96.53
35	4	0.16	96.69
36	4	0.16	96.85
37	10	0.40	97.25
38	2	0.08	97.34
39	1	0.04	97.38
40	2	0.08	97.46
41	2	0.08	97.54
42	2	0.08	97.62
43	3	0.12	97.74
44	3	0.12	97.86
45	4	0.16	98.02
47	3	0.12	98.14
50	3	0.12	98.26
51	1	0.04	98.30
52	5	0.20	98.51
54	3	0.12	98.63
55	1	0.04	98.67
56	1	0.04	98.71
57	1	0.04	98.75
58	2	0.08	98.83
60	6	0.24	99.07
65	1	0.04	99.11
66	1	0.04	99.15
67	2	0.08	99.23
70	1	0.04	99.27
75	1	0.04	99.31
76	1	0.04	99.35
77	1	0.04	99.39
83	1	0.04	99.43
84	1	0.04	99.48
90	5	0.20	99.68
93	1	0.04	99.72
111	2	0.08	99.80
150	1	0.04	99.84
180	2	0.08	99.92
213	1	0.04	99.96
232	1	0.04	100.00
			
Total	2,477	100.00


Does anyone know why this happens? Maybe it is because most observations
equal 1 and -ztnb- is not a very parsimonious model to estimate, so perhaps
I should simply fit a negative binomial regression on all the data, or
estimate a logit or probit only for the probability of having one count.

Thank you very much,

José Ignacio Antón

Department of Economics
University of Salamanca


