
From: Jenkins S P <stephenj@essex.ac.uk>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: st: unobs. heterogeneity in discrete hazard models
Date: Sat, 24 Sep 2005 18:43:49 +0100 (BST)

Gijs Dekkers asked about the interpretation of his estimates (reproduced below). My inclination would be to interpret these as he does, i.e. as suggesting that there is no 'significant' heterogeneity. If you add the trace option to the -pgmhaz8- command, you will probably see the -ml- evaluator trying to estimate smaller and smaller values of the gamma variance, but zero can never be reached given the way the model is parameterised: the variance is estimated on the log scale, so you will see the log variance becoming a larger and larger negative number. Further reassurance that the results are not a quirk might be gained by experimenting with different starting values for the gamma variance and seeing whether you get the same behaviour. You could also see what happens if you model frailty using a discrete mass point approach (see -hshaz-).
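The checks suggested above might be run roughly as follows. This is a sketch only, reusing the variable names and the -pgmhaz8- call from Gijs's post. The starting values in the loop are arbitrary illustrations, and the option name lnvar0() for the starting value of the log gamma variance, as well as the assumption that -hshaz- takes the same id(), dead() and seq() options, should be checked against -help pgmhaz8- and -help hshaz- before use.

```stata
* Sketch only: pid, cosep, duration and duration2 come from the quoted post;
* the starting values below are arbitrary and purely illustrative.

* 1. Back-transform the reported log variance (ln_varg _cons = -13.77345)
*    to see how close to zero the implied gamma variance is.
display exp(-13.77345)

* 2. Re-run with -trace- to watch the log variance drift downwards
*    from one iteration to the next.
pgmhaz8 duration2, id(pid) dead(cosep) seq(duration) trace

* 3. Try several starting values for the log of the gamma variance.
*    ASSUMPTION: lnvar0() is the relevant option; see -help pgmhaz8-.
foreach s in -5 -1 0 1 {
    display as text _n "Starting value for ln(variance): `s'"
    capture noisily pgmhaz8 duration2, id(pid) dead(cosep) seq(duration) lnvar0(`s')
}

* 4. Discrete mass point alternative.
*    ASSUMPTION: -hshaz- accepts the same id(), dead() and seq() options.
capture noisily hshaz duration2, id(pid) dead(cosep) seq(duration)
```

The first command should reproduce the "Gamma var." figure of about 1.04e-06 in the output below. If every starting value ends at a similarly large negative log variance with essentially the same log likelihood, that is consistent with a genuine boundary solution rather than a quirk of one particular run.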

I suspect that part of the "problem" is related to the short length of the ECHP panel which is used to create the spell data set used here. I conjecture that this makes it harder to distinguish frailty from duration dependence.

Stephen

=============================================

Professor Stephen P. Jenkins <stephenj@essex.ac.uk>

Institute for Social and Economic Research (ISER)

University of Essex, Colchester CO4 3SQ, UK

Phone: +44 1206 873374. Fax: +44 1206 873151.

http://www.iser.essex.ac.uk

Survival Analysis Using Stata: http://www.iser.essex.ac.uk/teaching/degree/stephenj/ec968/index.php

Date: Fri, 23 Sep 2005 11:55:51 +0200
From: Gijs Dekkers <gd@plan.be>
Subject: st: unobserved hetereogeneity and duration: interpreting pgmhaz8 and xtclog

Dear fellow Stata-users,

I am estimating a discrete duration model, explaining the probability that a cohabiting (unmarried) individual (cohab=1) separates, i.e. is no longer in a consensual union and is not married, after a certain time (the variable 'duration'). The dataset is the European Community Household Panel (ECHP). The variables are

pid: unique person identifier
duration: time (years)
cosep: 0 if the individual lives in a consensual union, 1 if (s)he does not live in a consensual union (and is not married)

The data is of the following form:

     +----------------------------+
     |     pid   duration   cosep |
     |----------------------------|
  1. | 1028101          1       0 |
  2. | 1028101          2       0 |
  3. | 1028105          1       0 |
  4. | 1028105          2       0 |
  5. | 2053101          1       0 |
     |----------------------------|
  6. | 2053102          1       0 |
  7. | 3023101          1       0 |
  8. | 3023101          2       0 |
  9. | 3023101          3       1 |
  etc...

A first analysis (somewhat disappointingly) showed that the only significant explanatory variables are functions of duration. In fact, the best model explains 'cosep' using 'duration' and its square 'duration2'.

. cloglog cosep duration duration2

Iteration 0:   log likelihood = -377.28398
Iteration 1:   log likelihood = -377.24866
Iteration 2:   log likelihood = -377.24865

Complementary log-log regression                Number of obs     =       2410
                                                Zero outcomes     =       2319
                                                Nonzero outcomes  =         91
                                                LR chi2(2)        =      20.35
Log likelihood = -377.24865                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       cosep |      Coef.  Std. Err.        z   P>|z|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
    duration |   .8310887   .2263603     3.67   0.000     .3874307    1.274747
   duration2 |  -.0825692   .0272601    -3.03   0.002     -.135998   -.0291405
       _cons |  -4.782091   .4091147   -11.69   0.000    -5.583941   -3.980241
------------------------------------------------------------------------------

However, I want to test for various parametric forms of frailty, using Jenkins' Lesson 7 on 'unobserved heterogeneity' (http://www.iser.essex.ac.uk/teaching/degree/stephenj/ec968/#_Toc520705914). First, he suggests testing for heterogeneity assuming a normally distributed frailty term (page 14).

. xtclog cosep duration duration2, nolog i(pid)

Random-effects complementary log-log model      Number of obs      =      2410
Group variable (i): pid                         Number of groups   =       739

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       3.3
                                                               max =         8

                                                Wald chi2(2)       =     18.20
Log likelihood  = -377.24865                    Prob > chi2        =    0.0001

------------------------------------------------------------------------------
       cosep |      Coef.  Std. Err.        z   P>|z|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
    duration |   .8310886   .2263602     3.67   0.000     .3874307    1.274747
   duration2 |  -.0825692   .0272601    -3.03   0.002     -.135998   -.0291404
       _cons |  -4.782091   .4091147   -11.69   0.000    -5.583941   -3.980241
-------------+----------------------------------------------------------------
    /lnsig2u |        -14          .                                .           .
-------------+----------------------------------------------------------------
     sigma_u |   .0009119          .                                .           .
         rho |   5.06e-07          .                                .           .
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) = 0.00  Prob >= chibar2 = 1.000

Now this already looks pretty strange to me, or is it my suspicious mind? Can I safely conclude that the hypothesis of normally distributed unobserved heterogeneity should (very much) be rejected?

Secondly, I used pgmhaz8 to test for gamma-distributed unobserved heterogeneity. I found the pgmhaz8 manual at http://ideas.repec.org/c/boc/bocode/s438501.html

If I understand this manual correctly (but I am not quite sure), the model should be

. pgmhaz8 duration2, id(pid) dead(cosep) seq(duration)

(anyway, the model pgmhaz8 duration duration2 etc. does not converge)

The results are:

PGM hazard model without gamma frailty

Generalized linear models                          No. of obs      =      2410
Optimization     : ML                              Residual df     =      2408
                                                   Scale parameter =         1
Deviance         =  769.387838                     (1/df) Deviance =  .3195132
Pearson          =  2400.608032                    (1/df) Pearson  =  .9969302

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(-ln(1-u))             [Complementary log-log]

                                                   AIC             =  .3209078
Log likelihood   = -384.693919                     BIC             = -17982.63

------------------------------------------------------------------------------
             |                 OIM
       cosep |      Coef.  Std. Err.        z   P>|z|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
   duration2 |   .0135626   .0055006     2.47   0.014     .0027817    .0243435
       _cons |  -3.456552   .1401122   -24.67   0.000    -3.731167   -3.181937
------------------------------------------------------------------------------

Iteration 0:   log likelihood = -385.00279
Iteration 1:   log likelihood = -384.79069
Iteration 2:   log likelihood = -384.73062
Iteration 3:   log likelihood = -384.70334
Iteration 4:   log likelihood = -384.69612
Iteration 5:   log likelihood =  -384.6944
Iteration 6:   log likelihood = -384.69403
Iteration 7:   log likelihood = -384.69394
Iteration 8:   log likelihood = -384.69392
Iteration 9:   log likelihood = -384.69392
Iteration 10:  log likelihood = -384.69392

PGM hazard model with gamma frailty             Number of obs      =      2410
                                                LR chi2()          =         .
Log likelihood = -384.69392                     Prob > chi2        =         .

------------------------------------------------------------------------------
       cosep |      Coef.  Std. Err.        z   P>|z|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
hazard       |
   duration2 |   .0135593   .0055347     2.45   0.014     .0027114    .0244072
       _cons |  -3.456778    .141079   -24.50   0.000    -3.733287   -3.180268
-------------+----------------------------------------------------------------
ln_varg      |
       _cons |  -13.77345    952.569    -0.01   0.988    -1880.774    1853.228
-------------+----------------------------------------------------------------
  Gamma var. |   1.04e-06   .0009935     0.00   0.999            0           .
------------------------------------------------------------------------------
LR test of Gamma var. = 0: chibar2(01) = -8.9e-06  Prob.>=chibar2 = .5

And here it is again: analogous to the results from xtclog, the hypothesis of gamma-distributed unobserved heterogeneity should be rejected. However, again like the xtclog results, the above pgmhaz8 results look suspiciously like some sort of corner solution, or an artefact. And this (finally!) brings me to my question: can I trust these results and safely conclude that the hypotheses of unobserved heterogeneity (both normally and gamma-distributed) should be rejected? Or is there something else going on? If so, any suggestions?

Any help would be appreciated!

Gijs
