Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: Artificial censoring in survival analysis

From   Steven Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Artificial censoring in survival analysis
Date   Thu, 4 Aug 2011 17:22:51 -0400

I am answering your second question, about -hshaz-, first. There are examples with two and three mass points at the end of the -help- file. The mixture model for heterogeneity means that the unobserved component of the log hazard takes its value at one of those points, with the locations and probabilities of the points to be estimated.
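To make the mass-point idea concrete, here is a minimal sketch in Python (not Stata, and not the -hshaz- code itself). It assumes a discrete-time proportional hazards model with a complementary log-log hazard and a two-point heterogeneity distribution. The function names, the flat baseline, and the mass-point locations and probabilities are all made up for illustration; in a fitted model they would be estimated from the data.

```python
import math

def cloglog_hazard(linear_index):
    """Discrete-time hazard under a complementary log-log link."""
    return 1.0 - math.exp(-math.exp(linear_index))

def mixture_survival(baseline, mass_points, probs):
    """Population survivor function S(t) for t = 1..len(baseline).

    Each latent type k has its own mass point m_k added to the log-hazard
    index; the population survivor function is the probability-weighted
    average of the type-specific survivor functions.
    """
    assert abs(sum(probs) - 1.0) < 1e-12
    survival = []
    for t in range(1, len(baseline) + 1):
        s_t = 0.0
        for m, p in zip(mass_points, probs):
            s_type = 1.0
            for j in range(t):
                s_type *= 1.0 - cloglog_hazard(baseline[j] + m)
            s_t += p * s_type
        survival.append(s_t)
    return survival

# Hypothetical values, chosen only for illustration.
baseline = [-1.5] * 12        # flat baseline log-hazard index, 12 months
mass_points = [0.0, -1.0]     # locations of the two mass points
probs = [0.6, 0.4]            # probabilities of the two latent types
surv = mixture_survival(baseline, mass_points, probs)
```

Even with a flat baseline for each type, the implied population hazard falls over time, because the high-hazard type leaves unemployment first. That is the pattern a model without unobserved heterogeneity would attribute to duration dependence.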

As for your earlier question: I don't see a good reason to censor individuals at 12 months because of problems in observing other individuals. However, until you describe your data more fully, I really can't say.

• What kind of study generated the data? A prospective cohort? A cross-section with retrospective recall?

• Was the study a complex sample, so that there are weights and clusters (PSUs)?  

• What is the purpose of YOUR analysis?

• What was the larger data set, if any, from which you took your specific data? What criteria did you use for inclusion?

• What is month "1"? A calendar month? The month of an interview? The first month of unemployment?

• Did unemployment start before month "1" for everybody, or only for some people? Did it start after month "1" for anyone?

• For those who started before month "1", do you know how long they had been unemployed?

• What do you mean when you say people were "younger" to experience the event? Did you mean "too young" to qualify as unemployed at the start?

• Why do you have information on some people for more than 12 months but not on others? How did observation end?

• Do you have information on people who were employed but became unemployed during the study period (perhaps not in the data set you describe below)?

In short, we need a complete description of the study design and of how observation began and ended.

Dear statalisters,

I am doing a project on the duration of unemployment. I want to compare models with and without unobserved heterogeneity. I want to use the -hshaz- module to estimate a mixture model, but I couldn't find an example of how to do that. I would appreciate any pointers to examples.


On Aug 2, 2011, at 3:25 AM, [email protected] wrote:

Hello statalisters,

I am analyzing employment data using survival methods over a period of 12 months. I decided to do so because some of my observations are younger to experience the event (in this case, exiting unemployment) for more than 12 months; that is, I observe them for only 12 months. To overcome this problem, I imposed a 12-month analysis period on all of my observations, so that every observation has the same 12 months in which to experience the event. I did so by artificially censoring those observations for which I have more than 12 months of data and which did not experience the event within 12 months. These are old individuals. I censored them even though I can see that some of them experience the event later, after the 12-month period.
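The artificial censoring rule described above can be sketched as follows. This is an illustrative Python fragment with made-up records, not the poster's actual code; the function name `censor_at` and the treatment of an exit exactly at month 12 as an observed event are assumptions.

```python
def censor_at(time, event, cutoff=12):
    """Return (time, event) after administrative censoring at `cutoff`.

    A spell still in progress after `cutoff` months is truncated to `cutoff`
    and marked as censored (event = 0), even if an exit was observed later.
    Assumption: an exit exactly at `cutoff` still counts as an event.
    """
    if time > cutoff:
        return cutoff, 0
    return time, event

# Hypothetical records: (months observed, 1 = exited unemployment, 0 = censored)
records = [(3, 1), (12, 1), (15, 1), (20, 0)]
censored = [censor_at(t, e) for t, e in records]
```

Note that the third record loses its observed exit at month 15: it enters the analysis as censored at month 12.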

My questions: 
1. Should I include in the analysis the observations that I censored?
2. Are the sample data presented below appropriate for survival analysis? Note that all observations experience the event except those I censored at month 12.

Below is a small representation of my data. The failure variable 'Failure' is cross-tabulated with the variable 'studytime' which is the number of months until experiencing the event.

studytime | Failure=0  Failure=1
----------+---------------------
        1 |         0        200
        2 |         0         89
        3 |         0         70
        5 |         0         68
        6 |         0         58
        7 |         0         50
        8 |         0         51
       10 |         0         45
       11 |         0         30
       12 |       150          0
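As a rough check on what these counts imply, here is a minimal Python sketch that turns the cross-tabulation into a life table of monthly hazards and Kaplan-Meier-style survival. The counts are copied from the table as posted (months 4 and 9 do not appear and are left out, not filled in); the function and variable names are illustrative only.

```python
# studytime: (censored, failed), copied from the cross-tabulation above
table = {
    1: (0, 200), 2: (0, 89), 3: (0, 70), 5: (0, 68), 6: (0, 58),
    7: (0, 50), 8: (0, 51), 10: (0, 45), 11: (0, 30), 12: (150, 0),
}

def life_table(table):
    """Return rows of (month, at_risk, failed, hazard, survival)."""
    at_risk = sum(c + f for c, f in table.values())
    survival = 1.0
    rows = []
    for month in sorted(table):
        censored, failed = table[month]
        hazard = failed / at_risk          # share of those at risk who exit
        survival *= 1.0 - hazard           # product-limit update
        rows.append((month, at_risk, failed, hazard, survival))
        at_risk -= censored + failed       # leave the risk set after this month
    return rows

rows = life_table(table)
for month, at_risk, failed, hazard, survival in rows:
    print(f"{month:>2}  at risk {at_risk:>3}  failed {failed:>3}  "
          f"hazard {hazard:.3f}  survival {survival:.3f}")
```

Because every censored observation sits at month 12, the estimated survival at the end is simply the share of the sample that was censored.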


*   For searches and help try:
