The Stata listserver

st: RE: stset and the NLSY97

From   "Nick Cox" <>
To   <>
Subject   st: RE: stset and the NLSY97
Date   Mon, 17 Oct 2005 02:01:47 +0100

Not my field, but your dummy calculation can 
be put more succinctly: 

gen sa = firstsex_yr <= age

However, safer would be to trap missings: 

gen sa = cond(mi(firstsex_yr, age), ., firstsex_yr <= age) 
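
To see why trapping missings matters, here is a minimal sketch on invented toy data (the four observations are made up for illustration and are not from the NLSY97). Stata treats numeric missing as larger than any number, so the short form quietly returns 0 or 1 where one of the inputs is missing. The names sa_short and sa_safe are just labels for this comparison.

```stata
* Toy data: values invented for illustration only
clear
input id age firstsex_yr
1 15 14
2 15 16
3 15 .
4 .  14
end

* Short form: missing counts as larger than any number,
* so id 3 gets 0 and id 4 gets 1
gen sa_short = firstsex_yr <= age

* Guarded form: ids 3 and 4 are both set to missing
gen sa_safe = cond(mi(firstsex_yr, age), ., firstsex_yr <= age)

list, clean
```

In the quoted data a missing firstsex_yr is said to mean that the event has not yet happened, so whether those observations should end up as 0 or as missing is a substantive choice for the analyst; the guarded form simply makes that choice explicit instead of leaving it to the comparison operator.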


Scott Cunningham
> I'm estimating a hazard model and had some basic questions.  The  
> dataset I'm using is the NLSY97.  It's a panel consisting of six  
> waves, and each year roughly 5500 individuals (after eliminating  
> various observations).  The outcome that I'm interested in is the  
> exit from virginity.  Individuals are not asked questions about sex  
> until they are 14, but when they are asked, they are asked at what  
> age they first experienced vaginal intercourse, and that age  
> oftentimes is prior to the year in which they were first asked about  
> their sexuality (ie, earlier than 14).  So, I have, for all  
> individuals, an integer corresponding to their age, in years, when  
> they lost their virginity, or missing data for those who are still  
> virgins.  After pulling the variables, I reshaped the data into a  
> long panel.
> Thinking about the "stset" command, I decided to follow this route.
> * generate sexually active dummy equalling 1 if sexually active, and 0 otherwise
> gen sa=.
> replace sa=0 if firstsex_yr<age
> replace sa=1 if firstsex_yr==age
> replace sa=1 if firstsex_yr>age
> * stset the data
> stset age, failure(sa) id(id)
> where "age" is the age of the individual in any given year, and  
> "firstsex_yr" is the age at which the individual first experienced  
> vaginal intercourse.
> What I've basically done, though, is made the person's age to be my
> duration variable, but I don't think this is correct.  Ideally, I'd
> like to simply have some sort of year variable to be the duration
> variable, but the problem I'm imagining is how to handle events that
> happened prior to the survey.  For instance, I know that some lost
> their virginity when they were 10, a year that is at best 2 years prior
> to the survey for some people, and 4 years prior to the survey for
> others.  So, it would seem that making "age" the duration variable is
> not the appropriate strategy, but I'm not sure of a better solution
> at this point.  Can someone provide me some suggestions on getting
> this data together?
