Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Nick Cox <njcoxstata@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Analysis of event history data

Date: Tue, 20 Mar 2012 14:14:15 +0000

Never say "one final question"!

-help egen- shows that there are -egen- functions -anycount()-, -anymatch()-, -anyvalue()-. So

    egen ones = anycount(y_*), values(1)
    keep if ones

Even if those functions did not exist, you could do this

    gen ones = 0
    quietly foreach v of var y_* {
        replace ones = ones + (`v' == 1)
    }
    keep if ones

Nick

On Tue, Mar 20, 2012 at 1:28 PM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:

> Thanks again, Nick. I figured it out with your help. But I have one final question. Given that my dataset consists of several million observations, I would like to trim the dataset down before I do the -reshape- command in order to avoid wasting time on observations that I would subsequently throw out. Say that I want to keep those observations where y_* is equal to 1 in one or more cases:
>
>     Id   y_1001   y_1002   y_1003   ...   y_1101   area_10   area_11
>     1    1        1        0              1        10        5
>
> I guess I could do the following:
>
>     keep if y_1001==1 | y_1002==1 etc.
>
> But given that I have around 1000 variables or so where I would need to check for the sufficient condition that would be a quite tedious function. Is there a smart way to get around this?

Nick Cox

> Do spend some time studying the resources for -reshape- including FAQs.
>
> First off, your -y_- cannot be an identifier! It doesn't identify observations.
>
> Second off, you can include -area- in the -reshape- but I guess you will need some extra surgery before and after. I would try a -rename- of the -area*- such as
>
>     foreach v of var area* {
>         rename `v' `v'01
>     }
>
> and then there will be some fill-in afterwards.
>
> Nick
>
> On Mon, Mar 19, 2012 at 12:30 PM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:
>> Thanks, Nick. -reshape- is a big help. But what if I have time-varying variables that I would like to carry over as well, but not with same intervals. For example:
>>
>>     Id   y_1001   y_1002   y_1003   ...   y_1101   area_10   area_11
>>     1    1        1        0              0        10        5
>>
>> If I do -reshape- using y_ as the identifier I would get something like:
>>
>>     Id   j      y_   area_10   area_11
>>     1    1001   1    10        5
>>     1    1002   1    10        5
>>     1    1003   0    10        5
>>     .
>>     .
>>     .
>>     1    1101   0    10        5
>>
>> But I would like to have something like:
>>
>>     Id   j      y_   area
>>     1    1001   1    10
>>     1    1002   1    10
>>     1    1003   0    10
>>     .
>>     .
>>     .
>>     1    1101   0    5
>>
>> Is that possible with -reshape-? Or would I have to convert the yearly time-varying variables into weekly first?
>>
>> Thanks again,
>> Kristian
>>
>> -----Original message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
>> Sent: 19 March 2012 12:43
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Analysis of event history data
>>
>> For most Stata purposes your data would indeed be better reshaped to a long data structure or shape or form (some people do say "format", but in a Stata context format implies -format-, etc.).
>>
>>     reshape long y_ , i(id) j(time)
>>     rename y_ status
>>
>> should do it. See also -tsspell- (SSC) and
>>
>>     SJ-7-2  dm0029  . . . . . . . .  Speaking Stata: Identifying spells
>>             . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
>>             Q2/07   SJ 7(2):249--265                   (no commands)
>>             shows how to handle spells with complete control over
>>             spell specification
>>
>> as well as the literature on survival analysis with which you are evidently familiar.
>>
>> Nick
>>
>> On Mon, Mar 19, 2012 at 11:32 AM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:
>>
>>> I am trying to do an analysis of transition in and out of public income transfers. My data is organized roughly the following way:
>>>
>>>     Id   y_1001   y_1002   y_1003
>>>     1    0        1        0
>>>     2    0        0        0
>>>     3    1        1        0
>>>
>>> This means that I have the weekly status of each individual from 1991 to 2011. But in order to do any sort of analysis I would guess that I had to convert the data into the following way instead (for example survival analysis):
>>>
>>>     Id   Status   Time
>>>     1    0        1
>>>     1    1        2
>>>     1    0        3
>>>     2    0        1
>>>     2    0        2
>>>     2    0        3
>>>     3    1        1
>>>     3    1        2
>>>     3    0        3
>>>
>>> Is that correct, and if so, does there exist a smart way to convert the data from one format into the other? Or can I perhaps use the data as given?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
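The steps discussed in this thread (trim first, rename the yearly variables, reshape, then fill in) can be put together as a minimal sketch. The toy data below are made up for illustration, and the fill-in step rests on an assumption not spelled out in the thread: that a yearly -area- value applies to every week of that year, so it can be carried forward from week 01.

```stata
clear
* Toy data (hypothetical): weekly status y_YYWW, yearly area_YY
input id y_1001 y_1002 y_1003 y_1101 area_10 area_11
1 1 1 0 0 10 5
2 0 0 0 0  7 7
3 0 0 0 1  3 4
end

* Trim before reshaping: keep ids with status 1 in at least one week
egen ones = anycount(y_*), values(1)
keep if ones
drop ones

* Give each yearly variable a week-01 suffix so it lines up with y_YY01
foreach v of varlist area_* {
    rename `v' `v'01
}

* Reshape both stubs; weeks with no area_ variable come out missing
reshape long y_ area_, i(id) j(time)
rename y_ status
rename area_ area

* Fill-in (assumption): carry the week-01 value forward within each id
bysort id (time): replace area = area[_n-1] if missing(area)

list, sepby(id)
```

With these toy data, id 2 is dropped before the -reshape-, and each remaining id ends up with one row per week and an -area- value on every row.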

