st: re: data creation for hazard regression

From   Kenisha Russell
Subject   st: re: data creation for hazard regression
Date   Fri, 8 Jun 2012 08:50:48 +0000

Hi Statalisters,
I am trying to create a data set on which I will use hazard regression (event history analysis, to demographers).
I am currently restructuring my data into person-period format so that I can use hazard regression to examine the propensity of an individual to transition from state x to state y. One of the variables I want to use is pregnancy.

Because I have the day and month each child was born, I first converted each birth date into century-month format and then subtracted 7 months from it to obtain a variable called pregnancy. In this particular data set the highest recorded parity is 3. The syntax I have used is below.

gen CMpregnancy1=.
replace CMpregnancy1=CMchild1-7 if CMchild1!=999999
The CMchild variables hold each child's birth date in century-month format.
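For anyone unfamiliar with the format, here is a minimal sketch of how a century-month variable is usually built, assuming the common convention of counting months from the start of 1900 (so January 1990 is (1990-1900)*12 + 1 = 1081). The variables birthyear1 and birthmonth1 are hypothetical names used only for illustration; they are not in my data set under those names.

```stata
* Sketch only: birthyear1 and birthmonth1 are hypothetical variables holding
* the calendar year and month (1-12) of the first child's birth.
* Assumes the usual convention: months elapsed since the start of 1900.
gen CMchild1 = (birthyear1 - 1900)*12 + birthmonth1

* Use the same "no birth recorded" code as in the syntax above
replace CMchild1 = 999999 if missing(birthyear1, birthmonth1)
```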

After this, I split the data:
stsplit pregnancy1, after(time = CMpregnancy1) at(0)

/* Recode pregnancy1 so that 0 marks the time before the woman
was pregnant and 1 marks the time after the pregnancy began */
replace pregnancy1 = pregnancy1 + 1
replace pregnancy1 = 0 if CMpregnancy1 == .
list pid-_st CMpregnancy* pregnancy* in 1/60

This is repeated three times: because the highest parity is 3, a woman can have up to 3 pregnancies, and all of them should be taken into account.
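The three repetitions could be written as a loop. This is only a sketch under some assumptions: the data are already stset with an id() variable and with century months as the time scale, the birth dates are stored as CMchild1 to CMchild3, and 999999 is the code for "no birth". The after(time = ...) form follows the documented stsplit syntax; whether time or _t is the right choice depends on how the data were stset.

```stata
* Sketch only: assumes stset has been run with id() and that
* CMchild1-CMchild3 exist, with 999999 meaning "no birth recorded".
forvalues i = 1/3 {
    * Pregnancy taken to start 7 months before the birth
    gen CMpregnancy`i' = CMchild`i' - 7 if CMchild`i' != 999999

    * Split each record at the start of pregnancy i
    stsplit pregnancy`i', after(time = CMpregnancy`i') at(0)

    * Recode: 0 = before pregnancy i began, 1 = after it began
    replace pregnancy`i' = pregnancy`i' + 1
    replace pregnancy`i' = 0 if CMpregnancy`i' == .
}
```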

Although I have written this syntax and split the data, I am not sure it is correct. Am I required to split the data at each pregnancy, that is, to create a period of time before the event (the pregnancy)?

If I do split on each pregnancy, I assume I would need to end each pregnancy episode at the point where the child is born. Is that correct? If so, how would I do that?

