st: re:re: data creation for hazard regression


From   Kenisha Russell <kenisha.russell@framtidsstudier.se>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: re:re: data creation for hazard regression
Date   Mon, 11 Jun 2012 09:58:08 +0000

Hi Austin,

I am using Stata 10.1.
Thank you for taking the time to answer my question. I apologise if I was not clear; it was my first time posting.

Research goal: Using a discrete-time competing-risks hazard model, I would like to analyse entry into first union (i.e., marriage or cohabitation), with pregnancy as one of the explanatory variables.

The data have been transformed into person-months (what I previously referred to as century-month data).
Because I have the year and month in which each child was born, I then carried out the steps outlined below:
Step 1: I created childbearing histories.
/* Create century months for birth of each child (here maximum # of children is 3),
using a loop running the code first for 1st, then 2nd, then 3rd child */
forval x = 1/3 {
    gen CMchild`x' = ym(childy`x', childm`x')
    * ym() returns missing when there is no such child; flag those as 999999
    recode CMchild`x' (. = 999999)
}


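Step 2: I then created a variable for the start of each pregnancy, 7 months before the birth.
forval x = 1/3 {
    * pregnancy month only where the child exists
    gen CMpregnancy`x' = CMchild`x' - 7 if CMchild`x' != 999999
    * same 999999 flag as above where there is no such child
    replace CMpregnancy`x' = 999999 if missing(CMpregnancy`x')
}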

After stset and running the above commands, my data currently look like this:

id	_t0	_t	_d	_st	_origin	CMchild1	CMchild2	CMchild3
3	0	68	0	1	1997m6	583	999999	999999
4	75	278	0	1	1985m10	999999	999999	999999
11	476	0	1	1969m4	248	338	999999
12	258	0	1	1987m6	401	424	509
13	27	230	0	1	1989m10	421	999999	999999
14	0	198	0	1	1992m6	999999	999999	999999
15	68	86	1	1	1986m5	476	999999	999999

I have checked the arithmetic and the results seem to be correct: for example, for id 11, where CMchild2==338, CMpregnancy2 (7 months earlier) is 331.
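
For reference, the check can be reproduced on a toy record, with the child months for id 11 typed in by hand from the listing above. This is only a sketch of the arithmetic on one made-up observation, not my full data.

```stata
* Toy check of the 7-month offset; the single record is typed in by hand
clear
input id CMchild1 CMchild2 CMchild3
11 248 338 999999
end

forvalues x = 1/3 {
    * pregnancy month is 7 months before the birth month, where a child exists
    gen CMpregnancy`x' = CMchild`x' - 7 if CMchild`x' != 999999
    * carry the 999999 "no such child" flag over
    replace CMpregnancy`x' = 999999 if missing(CMpregnancy`x')
}

* CMpregnancy1 should be 241, CMpregnancy2 331, CMpregnancy3 999999
list id CMpregnancy1 CMpregnancy2 CMpregnancy3
```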
So my question is: if the above is correct, do I now need to stsplit the dataset so that there is one data row per person per month at risk of pregnancy?

If I do split the records, then, if my reasoning is correct, I would need to end each pregnancy at the point where the child is born.
Is that correct? If so, how would I do that? You suggested that I create a contemporaneous time variable; can you explain how?
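
To make the question concrete, here is a minimal sketch of what I imagine the split might look like, on one made-up record. The variable names start, stop, event, month, cm and pregnant and all the values are hypothetical (they are not from my data), and I am assuming that _origin + _t0 gives the calendar month at the start of each row. I may well have this wrong.

```stata
* One hypothetical woman: at risk from month 300 to month 320, child born in month 315
clear
input id start stop event CMchild1
1 300 320 1 315
end
format start stop %tm

* analysis time is measured in months since start
stset stop, id(id) failure(event) origin(time start)

* one row per person-month of analysis time
stsplit month, every(1)

* calendar month at the start of each row (assumption: _origin + _t0)
gen cm = _origin + _t0

* pregnant from 7 months before the birth up to, but not including, the birth month
gen CMpregnancy1 = CMchild1 - 7
gen byte pregnant = cm >= CMpregnancy1 & cm < CMchild1

list id _t0 _t cm pregnant
```

With the 999999 flag for women without a child, the same comparison would simply give pregnant = 0 in every row, which is what I would want.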

Also, with regard to your earlier questions, Austin:
Are you sure every child is a biological child? Yes, I am sure all the children are biological.
Are there women with more than 3 children in the data? There are no women with more than 3 children in these data.
Do you have any information on gestational age at birth? I have no information about gestational age at birth.


I hope that this time my goal and question are much clearer.
Best,
Kenisha

