

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: re:re: data creation for hazard regression
Date: Mon, 11 Jun 2012 12:24:47 -0400

Kenisha Russell <kenisha.russell@framtidsstudier.se>:
The first link I gave in
http://www.stata.com/statalist/archive/2012-06/msg00472.html
has very detailed advice on dealing with discrete time models:
https://www.iser.essex.ac.uk/resources/survival-analysis-with-stata-module-ec968

I still don't see why you replace missing birthdates with 999999.

If you -expand- as detailed in the web course linked above, you will wind up with a very large dataset, so you may prefer to ignore the discreteness of the data.

It is still not clear exactly what your analysis consists of--what are the competing risks? Marriage and cohabitation? When is the onset of risk? Earliest possible age of marriage/cohab?

If you have each month from age 12 to age 50, you will have many observations, so you will need to streamline the variables in order to keep the dataset size small enough to work with. Then you can define variables in terms of what was true in that month for that person.

If you are using the discrete time models as above, you do not need to -stset- etc., but you do need each person to have an observation in each month. However: The data snippet you gave does not have that structure.

If you had Stata 11, you could use -stcrreg-, but see also -findit stcompet- or
http://www.statajournal.com/sjpdf.html?articlenum=st0059

On Mon, Jun 11, 2012 at 5:58 AM, Kenisha Russell <kenisha.russell@framtidsstudier.se> wrote:
> Hi Austin,
>
> I am using stata 10.1
> Thank you for taking the time out to answer my question. I do apologise, if I was not clear, it was my first time posting.
>
> Research goal: Using a discrete time competing risks hazard model I would like to analyse entry into first union (i.e marriage or cohabitation), and pregnancy is one of the explanatory variables.
>
> The data has been transformed into person-months (i.e what I previously referred to as century-months data)
> Because I had the year and month in which each child was born, I then executed the steps outlined below:
> Step 1: I created childbearing histories
> /* Create century months for birth of each child (here maximum # of children is 3).
> using a loop running the code first for 1st, then 2nd, then 3rd child */
> forval x = 1/3 {
> gen CMchild`x'=ym(childy`x', childm`x')
> recode CMchild`x'.=999999
> }
>
>
> Step 2: Then in order to create a variable for pregnancy I.
> gen CMpregnancy=.
> forval x = 1/3 {
> replace CMpregnancy`x'=CMchild`x'-7 if CMchild`x'!=999999
> replace CMpregnancy`x'==999999 if CMpregnancy`x'===.
> }
>
> After stset, and running the above commands my data currently looks like this.
>
> id _t0 _t _d _st _origin CMchild1 CMchild2 CMchild3
> 3 0 68 0 1 1997m6 583 999999 999999
> 4 75 278 0 1 1985m10 999999 999999 999999
> 11 476 0 1 1969m4 248 338 999999
> 12 258 0 1 1987m6 401 424 509
> 13 27 230 0 1 1989m10 421 999999 999999
> 14 0 198 0 1 1992m6 999999 999999 999999
> 15 68 86 1 1 1986m5 476 999999 999999
>
> I have checked the math and the outcomes seem to be correct, for example for Id # 11 where CMchild==338, CMpregnancy1 calculated from 7 months before was at time 331.
> So my question is, if the above is correct, do I now need to stsplit the dataset so that there's one data row per person per month at risk of pregnancy?
>
> If I do split the event, if my reasoning is correct I assume I would need to stop each pregnancy at the point where each child is born.
> Is that correct? If so, how would I do that? You suggested that I created a contemporaneous time variable, can you explain how?
>
> Also With regards to your earlier question Austin:
> Are you sure every child is a biological child? Yes, I am sure all the children are biological
> Are there women with more than 3 children in the data? There are no women with more than 3 children in this data
> Do you have any information on gestational age at birth? I have no information about gestational age at birth.
>
>
> I hope that this time my goal and question is much clearer.
> Best,
> Kenisha
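
The person-month layout described in the reply (one observation per person per month at risk, with covariates defined by what was true in that month) might be sketched roughly as follows. This is only an illustration on toy data, not the poster's actual dataset: the variables start_cm, end_cm, and exit_type are hypothetical names, the 7-month pregnancy window simply follows the CMchild-7 rule in the quoted message, and the commented-out -mlogit- line is one common way to fit a discrete-time competing-risks model rather than something prescribed in this thread. Missing birth dates are left as missing (.) instead of being recoded to 999999.

```stata
* Toy data; start_cm, end_cm and exit_type are hypothetical names.
* exit_type: 0 = censored, 1 = marriage, 2 = cohabitation (assumed coding)
* CMchild1 : month of first birth, left missing (.) if no birth
clear
input id start_cm end_cm exit_type CMchild1
1 400 405 1 404
2 400 403 0 .
3 400 404 2 410
end

* One row per person per month at risk (start_cm through end_cm inclusive)
gen nmonths = end_cm - start_cm + 1
expand nmonths
bysort id: gen month = start_cm + _n - 1

* Outcome is 0 in every month except the last, which carries the exit type
gen byte outcome = 0
bysort id (month): replace outcome = exit_type if _n == _N

* Time-varying indicator: 1 in the 7 months before the birth month.
* The !missing() guard matters because inrange() treats a missing
* bound as unbounded.
gen byte pregnant = !missing(CMchild1) & ///
    inrange(month, CMchild1 - 7, CMchild1 - 1)

* Duration since onset of risk, for the baseline hazard specification
gen duration = month - start_cm + 1

* Assumed model, not run here because the toy data are too small:
* mlogit outcome pregnant duration, baseoutcome(0)
```

With this structure no -stset- is needed, and the pregnancy indicator switches off by construction in the month of birth, so no separate step is required to "stop" a pregnancy. Additional births (CMchild2, CMchild3) could be handled the same way inside a -forvalues- loop.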

