Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Nick Cox <njcoxstata@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Using a while loop to compare rows and delete them?

Date: Wed, 20 Jun 2012 18:55:17 +0100

I have not tried to understand your details, but my experience is that
neither -while- nor -forvalues- is needed for spell problems.

I'd just like to draw your attention to previous work

SJ-7-2  dm0029  . . . . . . . . . . Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q2/07   SJ 7(2):249--265                      (no commands)
        shows how to handle spells with complete control over
        spell specification

-tsspell- from SSC:

tsspell from http://fmwww.bc.edu/RePEc/bocode/t
    'TSSPELL': module for identification of spells or runs in time series
    tsspell examines the data, which must be tsset time series, to
    identify spells or runs, which are contiguous sequences defined
    by some condition. tsspell generates new variables indicating
    distinct spells,

Nick

On Wed, Jun 20, 2012 at 4:39 PM, KLOSS <KLOSS@ifo.de> wrote:

> Working with some spell data (1 row = 1 episode = 1 observation; 1 spell is subdivided into several "episodes") on employment histories I have the task to identify rows which refer to the same person and the same period (say May 5, 2002 to May 21, 2002). Let's call such observations to be "parallel". I then have to check the employment status given in these parallel observations and compare them to each other. Given some pre-defined rules, one or the other of the parallel observations should be dropped.
>
> This has to be done for all rows in the data set and for all possible combination of parallel observations.
>
> Using STATA/SE 12.0, I start with 20,014,607 rows. I then employ a while loop in order to check all these rules for all observations (see the code below). I know: A while loop is not the fastest way to get results. However, I failed to get a forvalues loop doing the same. So, using said while loop the program has been running way too long. As I interrupted the procedure, exactly 19,999,999 rows remained in the data set.
>
> So, these are my questions:
> (1) Are the 19,999,999 rows I got just pure luck or are they a result of some limit of the while loop?
> (2) Is there any fast lane procedure available for my issue?
>
>
> My code is as follows:
>
> --- CODE START ---
>
> /*
> Data structure: A running spell is subdivided into 2 episodes at the date another spell of the same person (identified via variable "id") begins or ends. The begin and end dates of the original spell are called "begorig" and "endorig" and are written in every episode of this spell. The begin and end dates of an episode are called "begepi" and "endepi". Hence, two episodes are parallel if they show the same id-value and the same begepi-value.
> Within parallel episodes, observations are sorted as: employment (status==1) - training (status==4) - unemployed with benefit (status==5 & benefit==1) - unemployed without benefit (status==5 & benefit==0).
> */
>
> sort id begepi status benefit
>
> local i = 1 // counter
> local N = _N // number of observations
>
> while `i' <`N' {
>     local j = `i'+1
>     while `j' <=`N' {
>         if begepi[`i']!=begepi[`j'] | id[`i']!=id[`j'] { /* consider only parallel episodes */
>             local i = `i'+1
>             continue, break
>         }
>         if status[`i']==1 & status[`j']==4 { /* SITUATION 1 */
>             drop in `i'
>             local N = `N'-1
>             continue, break
>         }
>         if status[`i']==5 & status[`j']==5 & /*
>         */ benefit[`i']==1 & benefit[`j']==0 { /* SITUATION 2 */
>             drop in `j'
>             local N = `N'-1
>             continue
>         }
>         if status[`i']<=4 & status[`j']==5 & /*
>         */ begorig[`i']>=begorig[`j'] & endorig[`i']<=endorig[`j'] & /*
>         */ endorig[`i']-begorig[`i']<=14 { /* SITUATION 3 */
>             drop in `i'
>             local N = `N'-1
>             continue, break
>         }
>         if status[`i']<=4 & status[`j']==5 & /*
>         */ begorig[`i']<begorig[`j'] & endorig[`i']<=endorig[`j'] & /*
>         */ endorig[`i']-begorig[`j']<=14 { /* SITUATION 4 */
>             drop in `i'
>             local N = `N'-1
>             continue, break
>         }
>         if status[`i']<=4 & status[`j']==5 & /*
>         */ begorig[`i']>begorig[`j'] & endorig[`i']>=endorig[`j'] & /*
>         */ endorig[`j']-begorig[`i']<=30 { /* SITUATION 5 */
>             drop in `j'
>             local N = `N'-1
>             continue
>         }
>         local j = `j'+1
>     }
> }
>
>
> --- CODE END ---
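The remark above that neither -while- nor -forvalues- is needed can be illustrated with a minimal sketch using -bysort- and -egen-. It covers only SITUATION 1 and SITUATION 2 from the quoted code, and it assumes those two rules can be applied independently of SITUATIONS 3 to 5, which the original loop does not guarantee. The toy data and the flag variables has_train and has_benefit are hypothetical; the other variable names follow the original post.

```stata
* Toy data: hypothetical values, variable names as in the original post
clear
input id begepi status benefit
1 100 1 .
1 100 4 .
1 200 5 1
1 200 5 0
2 100 1 .
end

* Flag, within each group of parallel episodes (same id and begepi),
* whether a training episode or an unemployment-with-benefit episode occurs
bysort id begepi: egen byte has_train   = max(status == 4)
bysort id begepi: egen byte has_benefit = max(status == 5 & benefit == 1)

* SITUATION 1: employment parallel to training, drop the employment episode
drop if status == 1 & has_train

* SITUATION 2: unemployment without benefit parallel to unemployment
* with benefit, drop the episode without benefit
drop if status == 5 & benefit == 0 & has_benefit

drop has_train has_benefit
list, sepby(id)
```

Because each rule is evaluated for all groups at once, the work is a couple of sorted passes over the data rather than one -drop in- per deleted row. The rules that compare begorig and endorig across two specific episodes would need further by-group summaries and are not attempted here.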

**Follow-Ups**: **AW: st: Using a while loop to compare rows and delete them?** *From:* KLOSS <KLOSS@ifo.de>

**References**: **st: Using a while loop to compare rows and delete them?** *From:* KLOSS <KLOSS@ifo.de>
