
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Re: Request code simplification


From   Rebecca Pope <[email protected]>
To   [email protected]
Subject   Re: st: Re: Request code simplification
Date   Wed, 12 Jun 2013 09:26:38 -0500

Mike,
First things first, a point of order: -carryforward- is a user-written
command and you are asked to please identify these when you use them
and note from where you have obtained them.

Second, I tested your (slightly modified) code on my computer using a
test set built off of what you posted with a total of 270,000
observations (you just said "thousands", so I made it of moderate
size). Total run time was 0.79 seconds. So, if you are having computer
problems, my suggestion would be to check what else is running on your
computer.

Finally, this may be a stupid question, but if you are trying to find
patients with transplants, why aren't you searching for type_of_visit
== (code for transplant)? Does the other stuff you posted serve some
greater purpose you didn't mention?

There are multiple ways of handling the code for searching for
transplants. The easiest, I think, would simply be:

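*** begin example ***
* -1 if type_of_visit ends in "transplant", 0 otherwise
gen transplant = -1*strmatch(type_of_visit,"*transplant")
* the -1s sort first, so each patient's first record shows any transplant
bys pat_id (transplant): gen countthis = (-1*transplant) if _n==1

tab countthis
*** end example ***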

The code above ran in 0.22 seconds. Timing is from a machine with 4 GB
of RAM and an Intel Core i5-2400 3.1 GHz processor. Performance on
your laptop will likely differ, but I see no reason why either
approach would cause a crash.
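
As one illustration of the "multiple ways" mentioned above, here is a minimal
sketch of an alternative that flags every record of a transplant patient and
then counts each patient once. It relies only on official -egen- functions
(max() and tag()). The toy data and the variable names any_transplant and
one_per_pat are assumptions made up for this illustration; they are not from
the original post, and the sketch has not been timed on the 270,000-observation
test set.

```stata
*** begin example ***
* Toy data in the layout of the original post (values are invented)
clear
input str3 pat_id str10 visit_date str20 type_of_visit
"xxx" "09/01/2003" "first_clinic_visit"
"xxx" "09/15/2003" ""
"xxx" "02/04/2004" "liver_transplant"
"zzz" "05/01/2010" "first_clinic_visit"
"zzz" "05/03/2011" ""
end

* 1 on every record of a patient with any "...transplant" visit, else 0
bysort pat_id: egen byte any_transplant = max(strmatch(type_of_visit, "*transplant"))

* Tag one record per patient so that each patient is counted once
egen byte one_per_pat = tag(pat_id)

tab any_transplant if one_per_pat
*** end example ***
```

Unlike the sort-based version, this leaves the indicator on every record of the
patient, which may be convenient if later steps need to keep or drop all of a
patient's visits.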

Regards,
Rebecca

On Tue, Jun 11, 2013 at 4:14 PM, Michael Stewart
<[email protected]> wrote:
> Hello,
>
> I am working on a data set with thousands of patients (and multiple
> records per patient), and I am trying to identify patients with any
> transplant (a patient could undergo liver, kidney, intestine, etc.). I was
> wondering if we could simplify my code, as my laptop is freezing with
> the following set of commands.
>
>
> Here the goal is to find patients who have
> type_of_visit[1]=="first_clinic_visit" & any other type_of_visit is
> "" (after the records are sorted by pat_id and visit_date).
>
> My code :
>
> bysort PAT_ID (visit_date):carryforward VISIT,gen(v)
> bysort PAT_ID (visit_date):gen x=v[1]==v[_n]
> bysort PAT_ID (visit_date):egen z=sum(x)
> bysort PAT_ID (visit_date):egen zz=count(x)
> gen y= z< zz
>
>
> The dataset format is as follows.
>
> pat_id     visit_date     type_of_visit
> -------------------------------------------------------
> xxx      09/01/2003      first_clinic_visit
> xxx      09/15/2003
>                 .
>                 .
>                 .
>                 .
> XXX      12/12/2003
> XXX      2/04/2004       liver_transplant
> yyy      01/01/2004      first_clinic_visit
> yyy      02/02/2005
>                  .
>                   .
> yyy     01/03/2008      intestine_transplant
> zzz      05/01/2010     first_clinic_visit
> zzz     05/03/2011
>                   .
>                   .
>                   .
>                   .
> ------------------------------------------------------------
>
> As always, thanks a lot for your time.
> Thank you,
> Yours Sincerely,
> Mike.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

