
From: Gijs Dekkers <gd@plan.be>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: unobs. heterogeneity in discrete hazard models

Date: Mon, 26 Sep 2005 11:07:19 +0200

Dear Stephen,

Thank you for your reply.

1. I indeed see the logvariance continuously increasing.

2. I have experimented with different starting values for the gamma variance, ranging from -0.99 to 0.3. Apart from a region around -0.5, and at 0.3, where the log-likelihood is not concave and no solution is reached, all other starting values give the same results as below.

3. hshaz does not reach a solution, even after considerable experimenting with the number of discrete points, their starting values and starting probabilities.

If your suspicion that the problem lies in the short length of the ECHP were correct (and I agree that it is a problem, mind you), then wouldn't the same problem appear in every model estimated with the ECHP? To test this, I have estimated a completely different duration model, explaining whether or not somebody ceases to be in full-time education after a while, given that he or she was in full-time education before. The best model explained this using age, age squared, current level of education, duration and duration squared.

In this case, xtclog showed that the hypothesis that rho is zero must be rejected, so normally distributed unobserved heterogeneity is statistically significant (Prob >= chibar2 = 0.002).

pgmhaz8 led to the conclusion that unobserved heterogeneity, assuming a gamma distribution, is not statistically significant (Prob.>=chibar2 = .07891).
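
A note on reading the "Prob >= chibar2" figures quoted here: because the variance is tested on the boundary of its parameter space, the reference distribution is commonly taken to be a 50:50 mixture of a point mass at zero and a chi-squared with one degree of freedom. The sketch below is only an illustration in Python (it is not part of xtclog or pgmhaz8), it assumes that mixture, and the function name is hypothetical.

```python
import math


def chibar2_01_pvalue(lr_stat: float) -> float:
    """P-value under an assumed 50:50 mixture of chi2(0) and chi2(1).

    Only defined here for a strictly positive LR statistic; how a zero or
    slightly negative statistic is reported is left to the software.
    """
    if lr_stat <= 0:
        raise ValueError("this sketch only covers lr_stat > 0")
    # Upper tail of chi2(1): P(X > x) = erfc(sqrt(x / 2))
    chi2_1_tail = math.erfc(math.sqrt(lr_stat / 2.0))
    # The point mass at zero contributes nothing to the tail for x > 0,
    # so the mixture p-value is half the chi2(1) tail.
    return 0.5 * chi2_1_tail


if __name__ == "__main__":
    for stat in (1.0, 2.71, 3.84):
        print(f"LR = {stat:5.2f}  ->  p = {chibar2_01_pvalue(stat):.4f}")
```

Under this assumption a statistic of about 2.71 gives p = 0.05, compared with about 3.84 for an ordinary chi-squared(1) test.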

The nonparametric hshaz reaches a solution as well, though I have no idea yet how to interpret it*.

Anyway, wouldn't you agree that, since model II (exiting full-time education) shows significant normally distributed frailty, my not finding frailty in model I (separation of cohabiting couples) is not a result of the short time span of the ECHP, but simply means that it isn't there?

*This immediately brings me to another question: do you, or does anybody, know of or have a paper in which hshaz is applied and the estimation results are presented and discussed?

Thanks a lot,

Gijs

However, I have tried a completely different model (

Jenkins S P wrote:

Gijs Dekkers asked about the interpretation of his estimates (reproduced below). My inclination would be to interpret these as he does, i.e. as suggesting that there is no 'significant' heterogeneity. If you add the trace option to the -pgmhaz8- command, you'll probably see the -ml- evaluator trying to estimate smaller and smaller values of the gamma variance, but zero can never be reached given the way the model is parameterised (you'll see the logvariance becoming a larger and larger negative number). Further reassurance that the results are not a quirk might be gained by experimenting with different starting values for the gamma variance and seeing whether you get the same behaviour. You could also see what happens if you model frailty using a discrete mass point approach (see -hshaz-).
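
The point about the parameterisation can be illustrated numerically. The sketch below is plain Python, not Stata, and assumes only that the gamma variance is the exponential of the estimated log variance (the ln_varg coefficient in the output reproduced below). The function name and all trial values except the last are hypothetical.

```python
import math


def gamma_variance(ln_varg: float) -> float:
    """Variance implied by a log-variance parameter (assumed: var = exp(ln_varg))."""
    return math.exp(ln_varg)


if __name__ == "__main__":
    # Hypothetical path of an optimiser drifting towards minus infinity,
    # ending at the ln_varg estimate reported in the quoted pgmhaz8 output.
    for ln_varg in (-2.0, -6.0, -10.0, -13.77345):
        print(f"ln_varg = {ln_varg:10.5f}  ->  variance = {gamma_variance(ln_varg):.3e}")
```

However negative the log variance becomes, the implied variance stays strictly positive, which is why the estimate can approach zero but never equal it.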

I suspect that part of the "problem" is related to the short length of the ECHP panel which is used to create the spell data set used here. I conjecture that this makes it harder to distinguish frailty from duration dependence.

Stephen

=============================================

Professor Stephen P. Jenkins <stephenj@essex.ac.uk>

Institute for Social and Economic Research (ISER)

University of Essex, Colchester CO4 3SQ, UK

Phone: +44 1206 873374. Fax: +44 1206 873151.

http://www.iser.essex.ac.uk

Survival Analysis Using Stata: http://www.iser.essex.ac.uk/teaching/degree/stephenj/ec968/index.php

Date: Fri, 23 Sep 2005 11:55:51 +0200

From: Gijs Dekkers <gd@plan.be>

Subject: st: unobserved hetereogeneity and duration: interpreting pgmhaz8 and xtclog

Dear fellow Stata-users,

I am estimating a discrete duration model, explaining the probability that a cohabiting (unmarried) individual (cohab=1) separates, i.e. is no longer in a consensual union and not married, after a certain time (the variable 'duration'). The dataset is the European Community Household Panel (ECHP).

The variables are

pid: unique person identifier

duration: time (years)

cosep: 0 if the individual lives in a consensual union, 1 if (s)he does not live in a consensual union (and is not married)

The data is of the following form:

     +----------------------------+
     |     pid   duration   cosep |
     |----------------------------|
  1. | 1028101          1       0 |
  2. | 1028101          2       0 |
  3. | 1028105          1       0 |
  4. | 1028105          2       0 |
  5. | 2053101          1       0 |
     |----------------------------|
  6. | 2053102          1       0 |
  7. | 3023101          1       0 |
  8. | 3023101          2       0 |
  9. | 3023101          3       1 |

etc...

A first analysis (somewhat disappointingly) showed that the only significant explanatory variables are functions of duration. In fact, the best model explains 'cosep' using 'duration' and its square, 'duration2':

. cloglog cosep duration duration2

Iteration 0:   log likelihood = -377.28398
Iteration 1:   log likelihood = -377.24866
Iteration 2:   log likelihood = -377.24865

Complementary log-log regression                Number of obs     =       2410
                                                Zero outcomes     =       2319
                                                Nonzero outcomes  =         91

                                                LR chi2(2)        =      20.35
Log likelihood = -377.24865                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       cosep |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    duration |   .8310887   .2263603     3.67   0.000     .3874307    1.274747
   duration2 |  -.0825692   .0272601    -3.03   0.002     -.135998   -.0291405
       _cons |  -4.782091   .4091147   -11.69   0.000    -5.583941   -3.980241
------------------------------------------------------------------------------
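
To make the shape of the estimated duration dependence concrete, the coefficients reported for the cloglog model can be turned into per-period hazards. The sketch below is Python rather than Stata and assumes the usual complementary log-log relation h(t) = 1 - exp(-exp(xb)); the function and constant names are hypothetical, and the coefficient values are copied from the cloglog output.

```python
import math

# Point estimates copied from the cloglog output for cosep.
B_DURATION = 0.8310887
B_DURATION2 = -0.0825692
B_CONS = -4.782091


def cloglog_hazard(duration: float) -> float:
    """Discrete-time hazard implied by the quadratic cloglog index."""
    xb = B_CONS + B_DURATION * duration + B_DURATION2 * duration ** 2
    return 1.0 - math.exp(-math.exp(xb))


# The index is quadratic in duration, so it (and hence the hazard) peaks
# where its derivative is zero: duration = -b1 / (2 * b2).
turning_point = -B_DURATION / (2.0 * B_DURATION2)

if __name__ == "__main__":
    for year in range(1, 9):
        print(f"duration = {year}: hazard = {cloglog_hazard(year):.4f}")
    print(f"index peaks at duration = {turning_point:.2f}")
```

With these estimates the fitted hazard rises over the first years of the spell and declines afterwards, which is the pattern the positive 'duration' and negative 'duration2' coefficients describe.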

However, I want to test for various parametric forms of frailty, using Jenkins' Lesson 7 on 'unobserved heterogeneity' (http://www.iser.essex.ac.uk/teaching/degree/stephenj/ec968/#_Toc520705914).

First, he suggests testing for heterogeneity assuming a normally distributed frailty term (page 14).

. xtclog cosep duration duration2, nolog i(pid)

Random-effects complementary log-log model      Number of obs      =      2410
Group variable (i): pid                         Number of groups   =       739

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       3.3
                                                               max =         8

                                                Wald chi2(2)       =     18.20
Log likelihood = -377.24865                     Prob > chi2        =    0.0001

------------------------------------------------------------------------------
       cosep |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    duration |   .8310886   .2263602     3.67   0.000     .3874307    1.274747
   duration2 |  -.0825692   .0272601    -3.03   0.002     -.135998   -.0291404
       _cons |  -4.782091   .4091147   -11.69   0.000    -5.583941   -3.980241
-------------+----------------------------------------------------------------
    /lnsig2u |        -14          .                             .           .
-------------+----------------------------------------------------------------
     sigma_u |   .0009119          .                             .           .
         rho |   5.06e-07          .                             .           .
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =     0.00 Prob >= chibar2 = 1.000
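
The three auxiliary quantities in this output are tied together by simple arithmetic, which shows how small the estimated heterogeneity is. The sketch below is Python, not Stata. It assumes sigma_u = exp(lnsig2u / 2) and rho = sigma_u^2 / (sigma_u^2 + pi^2/6), where pi^2/6 is taken as the error variance on the complementary log-log scale. The function name is hypothetical.

```python
import math


def sigma_u_and_rho(lnsig2u: float) -> tuple[float, float]:
    """Recover sigma_u and rho from the log of the random-effect variance.

    Assumes rho = sigma_u^2 / (sigma_u^2 + pi^2 / 6), with pi^2 / 6 taken
    as the error variance on the complementary log-log scale.
    """
    sigma2_u = math.exp(lnsig2u)
    sigma_u = math.sqrt(sigma2_u)
    rho = sigma2_u / (sigma2_u + math.pi ** 2 / 6.0)
    return sigma_u, rho


if __name__ == "__main__":
    sigma_u, rho = sigma_u_and_rho(-14.0)  # /lnsig2u as reported by xtclog
    print(f"sigma_u = {sigma_u:.7f}, rho = {rho:.3e}")
```

With /lnsig2u = -14 this reproduces, to the displayed precision, the sigma_u of .0009119 and rho of 5.06e-07 in the table. The round value of -14 and the missing standard errors suggest that the estimate sits at a lower bound rather than at an interior maximum.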

Now this already looks pretty strange to me, or is it my suspicious mind? Can I safely conclude that the hypothesis of normally distributed unobserved heterogeneity should (very much) be rejected?

Secondly, I used pgmhaz8 to test for gamma-distributed unobserved heterogeneity. I found the pgmhaz8 manual at http://ideas.repec.org/c/boc/bocode/s438501.html

If I understand this manual correctly (but I am not quite sure), the model should be

. pgmhaz8 duration2, id(pid) dead(cosep) seq(duration)

(anyway, the model pgmhaz8 duration duration2 etc. does not converge)

The results are:

PGM hazard model without gamma frailty

Generalized linear models                          No. of obs      =      2410
Optimization     : ML                              Residual df     =      2408
                                                   Scale parameter =         1
Deviance         =  769.387838                     (1/df) Deviance =  .3195132
Pearson          =  2400.608032                    (1/df) Pearson  =  .9969302

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = ln(-ln(1-u))             [Complementary log-log]

                                                   AIC             =  .3209078
Log likelihood   = -384.693919                     BIC             = -17982.63

------------------------------------------------------------------------------
             |                 OIM
       cosep |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   duration2 |   .0135626   .0055006     2.47   0.014     .0027817    .0243435
       _cons |  -3.456552   .1401122   -24.67   0.000    -3.731167   -3.181937
------------------------------------------------------------------------------

Iteration 0:   log likelihood = -385.00279
Iteration 1:   log likelihood = -384.79069
Iteration 2:   log likelihood = -384.73062
Iteration 3:   log likelihood = -384.70334
Iteration 4:   log likelihood = -384.69612
Iteration 5:   log likelihood =  -384.6944
Iteration 6:   log likelihood = -384.69403
Iteration 7:   log likelihood = -384.69394
Iteration 8:   log likelihood = -384.69392
Iteration 9:   log likelihood = -384.69392
Iteration 10:  log likelihood = -384.69392

PGM hazard model with gamma frailty             Number of obs     =       2410
                                                LR chi2()         =          .
Log likelihood = -384.69392                     Prob > chi2       =          .

------------------------------------------------------------------------------
       cosep |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
hazard       |
   duration2 |   .0135593   .0055347     2.45   0.014     .0027114    .0244072
       _cons |  -3.456778    .141079   -24.50   0.000    -3.733287   -3.180268
-------------+----------------------------------------------------------------
ln_varg      |
       _cons |  -13.77345    952.569    -0.01   0.988    -1880.774    1853.228
-------------+----------------------------------------------------------------
  Gamma var. |   1.04e-06   .0009935     0.00   0.999            0           .
------------------------------------------------------------------------------
LR test of Gamma var. = 0: chibar2(01) = -8.9e-06  Prob.>=chibar2 = .5

And here it is again: analogous to the results from xtclog, the hypothesis of gamma-distributed unobserved heterogeneity should be rejected. However, again like the xtclog results, the above results of pgmhaz8 look suspiciously like some sort of corner solution, or an artefact.

And this (finally!) brings me to my question: can I trust these results and safely conclude that the hypotheses of unobserved heterogeneity (both normally and gamma-distributed) should be rejected? Or is there something else going on? If so, any suggestions?

Any help would be appreciated!

Gijs

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

--

dr. Gijs Dekkers

Federal Planning Bureau

Kunstlaan 47-49

1000 Brussels, Belgium

++32/(0)2/5077413

fax 7373 gd@plan.be, gijs.dekkers@soc.kuleuven.be


