Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Kenisha Russell <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: re: data creation for hazard regression
Date: Fri, 8 Jun 2012 08:50:48 +0000

Hi Statalisters,

I am trying to create a data set for hazard regression (event history analysis, to demographers). I am currently restructuring my data into person-period format in order to examine the propensity of an individual to transition from state x to state y. One of the variables I want to use is pregnancy. Because I have the day and month each child was born, after converting this date to century-month format I simply subtracted 7 months from the birth of each child to obtain a variable called pregnancy. In this particular data set the highest recorded parity is 3. The syntax I have used is below.

    gen CMpregnancy1=.
    replace CMpregnancy1=CMchild1-7 if CMchild1!=999999

CMchild holds the birth date of each child in century-month format. After this I split the data:

    stsplit pregnancy1, after(CMpregnancy1) at(0)
    /* We replace values for pregnancy1 so that 0 represents the time before
       the woman was pregnant and 1 the time after the pregnancy began */
    replace pregnancy1= pregnancy1+1
    replace pregnancy1=0 if CMpregnancy1==.
    list pid-_st CMpregnancy* pregnancy* in 1/60

This is repeated three times: because the highest parity is 3, there can be up to 3 pregnancies, and all of them should be taken into account.

Although I have written the syntax and split the data, I am not sure it is correct. Am I required to split the data at each pregnancy, i.e. to create a period before the event (the pregnancy)? If I do split at each pregnancy, I assume I would need to end each pregnancy spell at the point where the child is born. Is that correct? If so, how would I do that?

Best,
Kenisha
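To make the "stop each pregnancy at the birth" idea concrete, here is a minimal sketch on toy data. It does not use stsplit; it assumes the data are expanded to one row per person-month and flags a month as pregnant only if it falls in the 7 months before a birth. The variables CMstart, CMend, n_months, cm and pregnant, and the toy values, are hypothetical and not from the post; whether the birth month itself should count as pregnant is also an assumption (here it does not).

```stata
* Hypothetical toy data: only pid and CMchild1-CMchild3 follow the post;
* CMstart/CMend (observation window in century months) are assumed names.
clear
input pid CMstart CMend CMchild1 CMchild2 CMchild3
1 1300 1340 1315 1335 999999
2 1300 1320 999999 999999 999999
end

* One row per person-month from CMstart to CMend
generate n_months = CMend - CMstart + 1
expand n_months
bysort pid: generate cm = CMstart + _n - 1

* pregnant = 1 only during the 7 months before each birth, 0 otherwise,
* so the indicator switches off again in the month of the birth
generate byte pregnant = 0
forvalues k = 1/3 {
    replace pregnant = 1 if CMchild`k' != 999999 & ///
        inrange(cm, CMchild`k' - 7, CMchild`k' - 1)
}

list pid cm pregnant if pid == 1, sepby(pid)
```

With a time-varying 0/1 indicator of this kind, each pregnancy starts and ends within the person-period rows, rather than staying at 1 for all time after the first pregnancy.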

Follow-Ups: Re: st: re: data creation for hazard regression (From: Austin Nichols <[email protected]>)
