
From: Marcello Pagano <pagano@hsph.harvard.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: stset and the NLSY97
Date: Mon, 17 Oct 2005 08:11:51 -0400

I stand corrected as far as the multiple observations on each individual; I missed that. The rest is still valid if you collapse your data (or clone it, in the other direction, if you need to keep the multiplicities).

m.p.

Lars Kroll wrote:

I don't think you're on the right track, Marcello.

id  year  age  firstsex_yr
1   1997  15   .
1   1998  16   16
1   1999  17   16
1   2000  18   16
2   1997  12   .
2   1998  13   .
2   1999  14   11
2   2000  15   11
3   1997  16   12
3   1998  17   12
3   1999  18   12
3   2000  19   12

are single-failure, multiple-observations-per-subject data, so I would suggest:

gen failure = age==firstsex_yr if firstsex_yr<.
sort id year
// flag each subject's first observed year (if your first year isn't 0; one never knows...)
by id (year), sort : gen enterstudy = year==year[1]
stset age, id(id) failure(failure==1) exit(failure==1) ///
    enter(enterstudy==1)

Hope this helps,

Lars
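For readers who want to try Lars's suggestion, here is a minimal sketch that runs it on the three example respondents from the thread. It assumes that firstsex_yr holds the age at first sex (as the example values suggest) and that id is the only person identifier. Note that in this toy data the events for ids 2 and 3 were reported as having happened before their first interview, so age never equals firstsex_yr for them and they would not be counted as failures under this setup.

```stata
* Toy data copied from the thread (not real NLSY97 records)
clear
input id year age firstsex_yr
1 1997 15  .
1 1998 16 16
1 1999 17 16
1 2000 18 16
2 1997 12  .
2 1998 13  .
2 1999 14 11
2 2000 15 11
3 1997 16 12
3 1998 17 12
3 1999 18 12
3 2000 19 12
end

* 1 in the record where current age equals the reported age at first sex;
* missing while firstsex_yr is still missing
gen byte failure = (age == firstsex_yr) if firstsex_yr < .

* flag each person's first observed record (earliest survey year)
bysort id (year): gen byte enterstudy = (_n == 1)

* multiple-record survival data with age as analysis time and
* entry at the age of the first interview
stset age, id(id) failure(failure==1) exit(failure==1) ///
    enter(enterstudy==1)

* inspect the variables that stset created
list id year age _t0 _t _d _st, sepby(id)
```

Listing _t0, _t, _d and _st is a quick way to check which records stset actually uses and where each person enters and exits.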

On Sunday, 16.10.2005, at 21:46 -0400, Marcello Pagano wrote:

I do not understand your dilemma. Assuming everyone is telling the truth, what you seem to have is time to first sex as your outcome of interest, with the very Victorian identification of "death" as that time. If someone is 17 at the time of the survey without having had sex, then that is a censored observation. So your "time" variable is firstsex_yr if sa==1 and age if sa==0. So you need to generate a variable:

gen time = age
replace time = firstsex_yr if sa==1
stset time, failure(sa)

Your hazard should be zero for time <= 10, but that depends on your data. You actually do have information back then, assuming you have done a decent job of sampling and things have not changed that much over the years. (By that I mean that if everyone you question is over 12, say, then their experience in the 0 to 12 time period is still representative of what is going on in those years today.)

Hope this helps,

m.p.
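A minimal sketch of the single-record version of this approach, after collapsing the long data to one row per person, might look like the following. It uses the toy data from the thread and makes two assumptions that are not confirmed anywhere in the discussion: that each person's last interview carries the most complete information, and that a missing firstsex_yr at that interview means the event has not yet happened (rather than non-response).

```stata
* Toy data copied from the thread (not real NLSY97 records)
clear
input id year age firstsex_yr
1 1997 15  .
1 1998 16 16
1 1999 17 16
1 2000 18 16
2 1997 12  .
2 1998 13  .
2 1999 14 11
2 2000 15 11
3 1997 16 12
3 1998 17 12
3 1999 18 12
3 2000 19 12
end

* collapse to one record per person: keep the last interview
bysort id (year): keep if _n == _N

* event indicator: 1 if an age at first sex has been reported
* (assumption: missing means "not yet", i.e. censored)
gen byte sa = firstsex_yr < .

* time is age at first sex for events, age at last interview for censored cases
gen time = cond(sa, firstsex_yr, age)

stset time, failure(sa)
list id time sa _t _d
```

With real data it would be worth checking how the survey codes refusals and "don't know" answers before treating missing values as censored.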

Scott Cunningham wrote:

On Oct 16, 2005, at 9:01 PM, Nick Cox wrote:

Not my field, but your dummy calculation can be put more succinctly:

gen sa = firstsex_yr <= age

However, safer would be to trap missings:

gen sa = cond(mi(firstsex_yr, age), ., firstsex_yr <= age)

Nick

Nick,

Thanks for helping make the dummies more succinct.
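The two versions of the dummy differ only when a value is missing. A small sketch on made-up rows (the variable names sa_short and sa_safe are hypothetical, chosen here just to compare the two forms side by side) may make the difference concrete:

```stata
* Three made-up rows: missing firstsex_yr, both present, missing age
clear
input id age firstsex_yr
1 15  .
2 16 16
3  . 12
end

* succinct form: Stata treats missing as larger than any number,
* so row 1 evaluates to 0 and row 3 evaluates to 1
gen byte sa_short = firstsex_yr <= age

* missing-trapping form: rows 1 and 3 are set to missing instead
gen byte sa_safe = cond(mi(firstsex_yr, age), ., firstsex_yr <= age)

list id age firstsex_yr sa_short sa_safe
```

Which behaviour is wanted depends on what a missing firstsex_yr means in the survey. If it means "has not happened yet", a 0 is arguably the right code for that row, whereas a missing age probably should propagate as missing.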

Do you think, though, that it is correct to use "age" as the actual duration variable? So, for instance, I have a long dataset like this:

id  year  age  firstsex_yr
1   1997  15   .
1   1998  16   16
1   1999  17   16
1   2000  18   16
2   1997  12   .
2   1998  13   .
2   1999  14   11
2   2000  15   11
3   1997  16   12
3   1998  17   12
3   1999  18   12
3   2000  19   12

So, suppose I stset the data like so:

. stset age, failure(sa)

where "sa" is an indicator equalling 1 if the person has become sexually active (signalling "death" in this context) and 0 otherwise. If I stset the data such that "age" is the duration, have I really made the right decision? Or should I use "year", or some other variable that I create to correspond to the time that has passed?

I really want to look at ten periods initially, from 10 to 19 years of age. It's a short duration, relatively speaking, and most "exits" occur at 15-17. But I don't actually have data for respondents for those early, pre-survey ages, i.e., 10-12. So what's the best solution here? Do I create a variable, maybe "time" or "virgin_time", that takes on values from 1 to 10 and matches up both to the years that are covered in the data and to the years that are not?

Is this post making sense? I'm mainly just not sure of the proper way to execute this stset command to make use of the information I have in the form I currently have it in.

scott

*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/


References:
- st: RE: stset and the NLSY97
  From: "Nick Cox" <n.j.cox@durham.ac.uk>
- Re: st: RE: stset and the NLSY97
  From: Scott Cunningham <scunning@gmail.com>
- Re: st: RE: stset and the NLSY97
  From: Marcello Pagano <pagano@hsph.harvard.edu>
- Re: st: RE: stset and the NLSY97
  From: Lars Kroll <lars@climweb.de>


© Copyright 1996–2014 StataCorp LP