Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: "Kristian Thor Jakobsen" <KRJ@dm.dk>

To: <statalist@hsphsun2.harvard.edu>

Subject: SV: st: Analysis of event history data

Date: Tue, 20 Mar 2012 14:28:42 +0100

Thanks again, Nick. I figured it out with your help. But I have one final question. Given that my dataset consists of several million observations, I would like to trim it down before running -reshape-, so that I do not waste time on observations that I would subsequently throw out. Say that I want to keep those observations where y_* is equal to 1 in one or more cases:

Id   y_1001   y_1002   y_1003   ...   y_1101   area_10   area_11
1    1        1        0              1        10        5

I guess I could do the following:

keep if y_1001==1 | y_1002==1 etc.

But given that I have around 1000 variables for which I would need to check this sufficient condition, that would be quite tedious to type out. Is there a smart way to get around this?

Thanks again,
Kristian

-----Original message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
Sent: 19 March 2012 13:46
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Analysis of event history data

Do spend some time studying the resources for -reshape- including FAQs.

First off, your -y_- cannot be an identifier! It doesn't identify observations.

Second off, you can include -area- in the -reshape- but I guess you will need some extra surgery before and after. I would try a -rename- of the -area*- such as

foreach v of var area* {
    rename `v' `v'01
}

and then there will be some fill-in afterwards.

Nick

On Mon, Mar 19, 2012 at 12:30 PM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:

> Thanks, Nick. -reshape- is a big help. But what if I have time-varying variables that I would like to carry over as well, but not with the same intervals? For example:
>
> Id   y_1001   y_1002   y_1003   ...   y_1101   area_10   area_11
> 1    1        1        0              0        10        5
>
> If I do -reshape- using y_ as the identifier I would get something like:
>
> Id   j      y_   area_10   area_11
> 1    1001   1    10        5
> 1    1002   1    10        5
> 1    1003   0    10        5
> .
> .
> .
> 1    1101   0    10        5
>
> But I would like to have something like:
>
> Id   j      y_   area
> 1    1001   1    10
> 1    1002   1    10
> 1    1003   0    10
> .
> .
> .
> 1    1101   0    5
>
> Is that possible with -reshape-? Or would I have to convert the yearly time-varying variables into weekly first?
>
> Thanks again,
> Kristian
>
> -----Original message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
> Sent: 19 March 2012 12:43
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Analysis of event history data
>
> For most Stata purposes your data would indeed be better reshaped to a long data structure or shape or form (some people do say "format", but in a Stata context format implies -format-, etc.).
>
> reshape long y_ , i(id) j(time)
> rename y_ status
>
> should do it. See also -tsspell- (SSC) and
>
> SJ-7-2  dm0029  Speaking Stata: Identifying spells
>         N. J. Cox
>         Q2/07   SJ 7(2):249--265   (no commands)
>         shows how to handle spells with complete control over spell specification
>
> as well as the literature on survival analysis with which you are evidently familiar.
>
> Nick
>
> On Mon, Mar 19, 2012 at 11:32 AM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:
>
>> I am trying to do an analysis of transition in and out of public income transfers. My data is organized roughly the following way:
>>
>> Id   y_1001   y_1002   y_1003
>> 1    0        1        0
>> 2    0        0        0
>> 3    1        1        0
>>
>> This means that I have the weekly status of each individual from 1991 to 2011. But in order to do any sort of analysis (for example survival analysis) I would guess that I had to convert the data into the following layout instead:
>>
>> Id   Status   Time
>> 1    0        1
>> 1    1        2
>> 1    0        3
>> 2    0        1
>> 2    0        2
>> 2    0        3
>> 3    1        1
>> 3    1        2
>> 3    0        3
>>
>> Is that correct, and if so, does there exist a smart way to convert the data from one format into the other? Or can I perhaps use the data as given?
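One possible way to handle the trimming question at the top of this message, offered as a minimal sketch and not as the reply actually given on the list: -egen- with the anymatch() function can flag observations in which any of the y_* variables equals 1, without typing out each comparison. The toy values, the extra ids 2 and 3, and the helper variable name `anyone` are assumptions made purely for illustration.

```stata
* Toy data in the wide layout from the post (values are invented for illustration)
clear
input id y_1001 y_1002 y_1003 area_10 area_11
1 1 1 0 10 5
2 0 0 0  7 7
3 0 0 1  3 4
end

* Flag observations in which at least one y_* variable equals 1
* (anyone is a hypothetical helper name)
egen byte anyone = anymatch(y_*), values(1)

* Keep only the flagged observations, then drop the helper
keep if anyone
drop anyone

* The trimmed data can then be reshaped as suggested earlier in the thread
reshape long y_, i(id) j(time)
rename y_ status
```

Here id 2 is dropped before the -reshape-, so only ids 1 and 3 are converted to long form. If the y_* variables are strictly 0/1 with no other codes, `egen byte anyone = rowmax(y_*)` followed by `keep if anyone == 1` should give the same result.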
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:
- **Re: st: Analysis of event history data** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **st: Analysis of event history data** *From:* "Kristian Thor Jakobsen" <KRJ@dm.dk>
- **Re: st: Analysis of event history data** *From:* Nick Cox <njcoxstata@gmail.com>
- **SV: st: Analysis of event history data** *From:* "Kristian Thor Jakobsen" <KRJ@dm.dk>
- **Re: st: Analysis of event history data** *From:* Nick Cox <njcoxstata@gmail.com>
