
Re: st: pgmhaz/hshaz output, why does it look like this?

From   Hilde Karlsen <>
Subject   Re: st: pgmhaz/hshaz output, why does it look like this?
Date   Sun, 22 Feb 2009 10:47:40 +0100

Thank you so much; I should have grasped that by myself, I guess. Reading the Lessons on your website was very helpful, as was the information on -pgmhaz8-.

However, having done some reading, I have one more question (not directly related to the subject line of this e-mail): how do I decide whether to use the -pgmhaz8- program or the -xtcloglog- command for my survival analysis?

If I have understood correctly, -pgmhaz8- assumes Gamma-distributed unobserved heterogeneity, while -xtcloglog- assumes Normally distributed unobserved heterogeneity. Is there a way to explore which of these distributions I should rely on for my analysis? I would like to know how one explains or justifies preferring one distribution over the other.

I have tested both the -pgmhaz8- program and the -xtcloglog- command on my data; however, only the latter converges in a reliable (I hope) way. The -pgmhaz8- program did produce a model in which no standard errors were missing, but the message before the last iteration was "nonconcave function encountered".

I used the -trace()- option to explore the iterations in -pgmhaz8-, but I am not sure how to interpret them. I tried the -iterate()- option, but that resulted in missing estimates for the gamma variance. I haven't experimented with the starting values for ln(v), though.

Can I safely conclude that there is no significant frailty in my material?
What I typed, and some of the output, is shown below.

Thank you so much for your consideration.

Best regards,

. xtcloglog movedout year1_4 year5_7 year8_10  children logEarn
 child_year7 child_year10 earnyear4 earnyear7 if male ==1,

              Coeff.         SE       [95% confidence interval]

/lnsig2u   -10.00304    41.82036    -91.96944      71.96336

sigma_u     .0067277    .1406775     1.07e-20      4.23e+15
rho         .0000275    .0011507     6.95e-41             1

Likelihood-ratio test of rho=0:
chibar2(01) = 3.1e-05 Prob >= chibar2 = 0.498

. pgmhaz8  year1_4 year5_7 year8_10  children  logEarn
 child_year7 child_year10 earnyear4 earnyear7 if male ==1,
 i (LPNR) s(year) d(movedout)

             Coeff         SE        z     P>|z|
   _cons   -9.015656     155.7742   -0.06  0.954

Gamma variance, exp(ln_varg) = .00012149;
Std.Err = .01892544; z = .00641955
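
As a rough arithmetic check on how the reported quantities relate to each other, the following -display- commands recompute the derived values from the two log-scale estimates above. This is an illustrative sketch, not part of the original session. It assumes the usual -xtcloglog- definition rho = sigma_u^2/(sigma_u^2 + pi^2/6), and that the -pgmhaz8- _cons shown above belongs to the ln_varg equation.

```stata
* sigma_u is the square root of exp(/lnsig2u)
display exp(-10.00304/2)

* rho = sigma_u^2 / (sigma_u^2 + pi^2/6)  (assumed xtcloglog definition)
display exp(-10.00304) / (exp(-10.00304) + _pi^2/6)

* Gamma variance reported by pgmhaz8 is exp(ln_varg)
display exp(-9.015656)

* z for the Gamma variance is the estimate divided by its standard error
display .00012149/.01892544
```

Up to rounding, these should reproduce sigma_u, rho, the Gamma variance, and z as printed above. Both frailty variance estimates are very close to zero.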

Quoting "Stephen P. Jenkins" <>:


Date: Thu, 19 Feb 2009 10:30:50 +0100
From: Hilde Karlsen <>
Subject: st: pgmhaz/hshaz output, why does it look like this?

Dear statalisters,

I am having trouble understanding why the result of my pgmhaz-command
ends up as shown in the output below. (Why are there "missing" values
for several of the estimates?)

Does this imply I should not use pgmhaz for my discrete time hazard
analysis? I've tried hshaz, but it yielded the same result. Moreover,
constructing the baseline hazard in a different manner (for example
log(time), or creating finer time units with dummy variables) also
did not solve this problem. I am trying to find out whether there is
significant heterogeneity in the data, and the first analysis (nocons)
suggested statistically significant frailty. Should I rather be using
a different command/program?

Here is my syntax and output; I hope it is readable.

. pgmhaz  year1_4 year5_7 year8_10 male, i(LPNR) s(year) d(movedout)

PGM hazard model with Gamma heterogeneity

Number of obs   =   16181
Model chi2(3)   =       .
Prob > chi2     =       .
Log Likelihood  =  -1003.6567957

movedout       Coef.     Std. Err.        z      P>z

year1_4    -.5213477    .1809304     -2.88    0.004
year5_7    -.0568201           .         .        .
year8_10    .1837397     .218296      0.84    0.400
_cons      -4.199123    .1344249    -31.24    0.000

_cons      -14.54299           .         .        .

Gamma variance, exp(ln_varg) = 4.831e-07; Std. Err. = 0; z = .
- --------


If you wish to include dummy variables for all of the year* duration
intervals, then you need to exclude the constant term from the
regressor list. (Alternatively, drop one of the year* variables.) In
short, you have a perfect collinearity issue with your syntax. The
same applies to -hshaz- syntax use.  I am sceptical about your claim
that the same problem arose when you used log(duration) instead of the
year* dummies.  But then, contrary to Statalist FAQ recommendations,
you haven't shown exactly what you typed in that case.
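
The two ways out of the collinearity described above can be sketched as follows, reusing the variable names from the original command. This is an illustration rather than a tested run. It assumes the year* dummies span all duration intervals in the data, and that -nocons- is the option that suppresses the constant; check -help pgmhaz8- before relying on either form.

```stata
* Option 1: keep the constant and drop one duration dummy
* (year8_10 then serves as the reference interval)
pgmhaz8 year1_4 year5_7 male, i(LPNR) s(year) d(movedout)

* Option 2: keep all duration dummies and suppress the constant
pgmhaz8 year1_4 year5_7 year8_10 male, i(LPNR) s(year) d(movedout) nocons
```

The two specifications describe the same baseline hazard. Only the parameterisation differs, so the log likelihoods should agree if both runs converge.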

An estimate of -14.5 for log(gamma variance) implies an estimate of
the gamma variance of near-enough zero. This is probably telling you
that frailty (unobserved heterogeneity) is hard to find with your
data.  You can investigate this further by tracing the path of
estimates at each step, and experimenting with different starting
values for the log(gamma variance). Fix the collinearity issue first.
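
A hedged sketch of the tracing and starting-value experiments suggested above, using a specification with one duration dummy dropped so that the collinearity is already removed. It assumes that -pgmhaz8- accepts a -trace- option and an -lnvar0(#)- option setting the starting value for the log of the gamma variance; verify both names in -help pgmhaz8-. The starting values -5, -1 and 1 are arbitrary choices for illustration.

```stata
* Show the parameter vector at every iteration
pgmhaz8 year1_4 year5_7 male, i(LPNR) s(year) d(movedout) trace

* Refit from several starting values for log(gamma variance)
* and compare the final log likelihoods and variance estimates
foreach v of numlist -5 -1 1 {
    display _newline "Starting value for log(gamma variance): `v'"
    pgmhaz8 year1_4 year5_7 male, i(LPNR) s(year) d(movedout) lnvar0(`v')
}
```

If every run ends at the same log likelihood with the variance estimate near zero, that is consistent with there being little detectable frailty rather than with a poor starting point.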

I strongly recommend reading the appropriate Lesson at my website (and
the help files) for discussion of these issues, with illustrations.
URL is in my signature below.

Finally, you should update and use -pgmhaz8- rather than -pgmhaz-
(assuming you have version 8 or higher).

Professor Stephen P. Jenkins <>
Director, Institute for Social and Economic Research
University of Essex, Colchester CO4 3SQ, U.K.
Tel: +44 1206 873374.  Fax: +44 1206 873151.
Survival Analysis using Stata:
Downloadable papers and software:

Learn about the UK's new household panel survey, "Understanding

*   For searches and help try:
