



AW: st: Using a while loop to compare rows and delete them?


From   KLOSS <KLOSS@ifo.de>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   AW: st: Using a while loop to compare rows and delete them?
Date   Thu, 21 Jun 2012 15:09:15 +0200

Dear Nick,

Thank you for the pointer to the literature. I don't know why I didn't think of -by- earlier! The program is much faster now.

Kind Regards
Michael

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Wednesday, 20 June 2012 19:55
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Using a while loop to compare rows and delete them?

I have not tried to understand your details, but my experience is that neither -while- nor -forvalues- is needed for spell problems.

I'd just like to draw your attention to previous work:

SJ-7-2  dm0029  . . . . . . . . . . . . . . Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):249--265                                 (no commands)
        shows how to handle spells with complete control over
        spell specification

-tsspell- from SSC:

tsspell from http://fmwww.bc.edu/RePEc/bocode/t
    'TSSPELL': module for identification of spells or runs in time series /
    tsspell examines the data, which must be tsset time series, to / identify
    spells or runs, which are contiguous sequences defined / by some
    condition. tsspell generates new variables indicating / distinct spells,

Nick
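
As a rough illustration of the point about not needing -while- or -forvalues-, here is a minimal sketch of numbering spells with -by- and subscripting alone. The toy data and the variable names time, first, spell and seq are assumptions made up for this example; they do not come from the poster's data set or from the references above.

```stata
* Toy panel: one row per person and period (hypothetical data)
clear
input id time status
1 1 1
1 2 1
1 3 5
1 4 5
1 5 1
2 1 5
2 2 5
end

* A spell starts at a person's first row or whenever status changes
bysort id (time): generate byte first = (_n == 1) | (status != status[_n-1])

* The running sum of the start markers numbers the spells within each person
by id: generate spell = sum(first)

* Position of each row within its spell
bysort id spell (time): generate seq = _n

list, sepby(id)
```

Under -by-, both _n and sum() restart in every group, so no explicit loop over observations is needed.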

On Wed, Jun 20, 2012 at 4:39 PM, KLOSS <KLOSS@ifo.de> wrote:

> I am working with spell data on employment histories (1 row = 1 episode = 1 observation; 1 spell is subdivided into several "episodes"). My task is to identify rows that refer to the same person and the same period (say May 5, 2002 to May 21, 2002). Let's call such observations "parallel". I then have to compare the employment status given in these parallel observations. Given some pre-defined rules, one or the other of the parallel observations should be dropped.
>
> This has to be done for all rows in the data set and for all possible combinations of parallel observations.
>
> Using Stata/SE 12.0, I start with 20,014,607 rows. I then use a while loop to check all these rules for all observations (see the code below). I know that a while loop is not the fastest way to get results, but I could not get a forvalues loop to do the same. With this while loop the program ran far too long. When I interrupted it, exactly 19,999,999 rows remained in the data set.
>
> So, these are my questions:
> (1) Is the count of exactly 19,999,999 remaining rows a coincidence, or does it reflect some limit of the while loop?
> (2) Is there a faster procedure for this problem?
>
>
> My code is as follows:
>
> --- CODE START ---
>
> /*
> Data structure: A running spell is subdivided into 2 episodes at the date another spell of the same person (identified via variable "id") begins or ends. The begin and end dates of the original spell are called "begorig" and "endorig" and are written in every episode of this spell. The begin and end dates of an episode are called "begepi" and "endepi". Hence, two episodes are parallel if they show the same id-value and the same begepi-value.
> Within parallel episodes, observations are sorted as: employment (status==1) - training (status==4) - unemployed with benefit (status==5 & benefit==1) - unemployed without benefit (status==5 & benefit==0).
> */
>
> sort id begepi status benefit
>
> local i = 1 // counter
> local N = _N // number of observations
>
> while `i' <`N' {
>        local j = `i'+1
>        while `j' <=`N' {
>                if begepi[`i']!=begepi[`j'] | id[`i']!=id[`j'] { /* consider only parallel episodes */
>                        local i = `i'+1
>                        continue, break
>                }
>                if status[`i']==1 & status[`j']==4 { /* SITUATION 1 */
>                        drop in `i'
>                        local N = `N'-1
>                        continue, break
>                }
>                if status[`i']==5 & status[`j']==5 & /*
>                */ benefit[`i']==1 & benefit[`j']==0 { /* SITUATION 2 */
>                        drop in `j'
>                        local N = `N'-1
>                        continue
>                }
>                if status[`i']<=4 & status[`j']==5 & /*
>                */ begorig[`i']>=begorig[`j'] & endorig[`i']<=endorig[`j'] & /*
>                */ endorig[`i']-begorig[`i']<=14 { /* SITUATION 3 */
>                        drop in `i'
>                        local N = `N'-1
>                        continue, break
>                }
>                if status[`i']<=4 & status[`j']==5 & /*
>                */ begorig[`i']<begorig[`j'] & endorig[`i']<=endorig[`j'] & /*
>                */ endorig[`i']-begorig[`j']<=14 { /* SITUATION 4 */
>                        drop in `i'
>                        local N = `N'-1
>                        continue, break
>                }
>                if status[`i']<=4 & status[`j']==5 & /*
>                */ begorig[`i']>begorig[`j'] & endorig[`i']>=endorig[`j'] & /*
>                */ endorig[`j']-begorig[`i']<=30 { /* SITUATION 5 */
>                        drop in `j'
>                        local N = `N'-1
>                        continue
>                }
>                local j = `j'+1
>        }
> }
>
>
> --- CODE END ---
>
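
For readers looking for the "faster procedure" asked about in question (2), the following is a minimal sketch of how SITUATION 1 and SITUATION 2 from the quoted loop might be written with -by- instead. It is not the code the poster ended up with. The toy data and the helper variables anytrain and anyben are assumptions for illustration, and the sketch ignores any interaction with SITUATIONS 3 to 5, which would need the same kind of group-level flags built from begorig and endorig.

```stata
* Toy episode data: one row per episode (hypothetical values)
clear
input id begepi status benefit
1 100 1 0
1 100 4 0
1 200 5 0
1 200 5 1
2 100 1 0
2 100 5 0
end

* Per parallel group (same id and begepi): is there a training episode,
* and is there an unemployment episode with benefit?
bysort id begepi: egen byte anytrain = max(status == 4)
bysort id begepi: egen byte anyben = max(status == 5 & benefit == 1)

* SITUATION 1: drop employment episodes that run parallel to training
drop if status == 1 & anytrain

* SITUATION 2: drop unemployment without benefit when a parallel
* unemployment episode with benefit exists
drop if status == 5 & benefit == 0 & anyben

drop anytrain anyben
sort id begepi status benefit
list, sepby(id begepi)
```

Each -drop if- works on the whole data set at once, so the cost no longer depends on dropping rows one at a time. Unlike the loop, the result also does not depend on the order of rows within a parallel group.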

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

____________

The ifo Dresden branch is part of:

ifo Institut - Leibniz-Institut fuer Wirtschaftsforschung an der Universitaet Muenchen e.V.
Poschingerstr. 5, 81679 Muenchen,
Registered office: Muenchen, register of associations no. 4419, Amtsgericht Muenchen,
Executive board: Prof. Dr. Dres. h.c. Hans-Werner Sinn (President), Meinhard Knoche;
Tax number 143/217/10159, VAT ID no. DE129516729

