Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: identifying re-operations from a list of operation codes and dates

From: Stephen Martin
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: identifying re-operations from a list of operation codes and dates
Date: Mon, 9 Jul 2012 15:27:27 +0100

```
Thanks Nick.

Steve

On 06/07/2012, Nick Cox <njcoxstata@gmail.com> wrote:
> There is absolutely no problem with combining -sort- and -by:-. Indeed
> that is utterly routine. But that would only be needed to do something
> else. -sort-ing by itself just requires a varlist.
>
> Here is a sandpit to play in
>
> sysuse auto, clear
> edit for rep78
> bysort foreign (rep78) : gen id = _n
> bysort foreign rep78 : gen id2 = _n
> edit for rep78 id id2
> sort rep78 foreign
> edit rep78 foreign id id2
>
> At a guess you should try
>
> bysort id op_code (op_date) : gen reop = (_n > 1) & (op_date - op_date[1]) > 1
>
> There is a tutorial at
>
> SJ-2-1  pr0004  . . . . . . . . . . Speaking Stata:  How to move step by: step
>         . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>         Q1/02   SJ 2(1):86--102                                  (no commands)
>         explains the use of the by varlist : construct to tackle
>         a variety of problems with group structure, ranging from
>         simple calculations for each of several groups to more
>         advanced manipulations that use the built-in _n and _N
>
> which is accessible at the Stata Journal website.
>
>
> On Fri, Jul 6, 2012 at 10:38 PM, Stephen Martin
> <stephen.martin@york.ac.uk> wrote:
>> Thanks for this Nick.
>>
>> First a couple of clarifications:
>>
>> a) yes, oper_7 corresponds to operdate_7
>> and
>> b) each oper_ variable contains a single four digit operation code.
>>
>> Having reshaped as you suggested I thought I saw where you were
>> directing me but I tried to sort on op_date by id (to get each
>> patient's operation codes in chronological order) but sort does not
>> appear to allow by (am I misunderstanding something here?).
>>
>> To illustrate the data set, here is the reshaped data for the first
>> patient.  She has four operation codes and the op_date is an elapsed
>> date.  The position variable can run from 1 to 24 but I have dropped
>> values 5 - 24 as these are empty for both op_code and op_date.
>>
>> id  position  op_code  op_date
>> 1   1            A123     18000
>> 1   2            C567     18000
>> 1   3            X678      18010
>> 1   4            B679      17996
>>
>> Any further guidance would be most welcome.
>>
>> Steve
>>
>>
>> On 06/07/2012, Nick Cox <njcoxstata@gmail.com> wrote:
>>> Paradoxically, or otherwise, there is a lot of detail to absorb here,
>>> yet you may be suppressing part of the story to keep it as simple as
>>> possible. We lose either way.
>>>
>>> Assuming that e.g. oper_7 corresponds to operdate_7 then all is not
>>> lost. But I would first
>>>
>>> reshape long oper_ operdate_ , i(id)
>>>
>>> and then clean up by renaming, dropping missing, sorting on date.
>>>
>>> But although you named these variables, I fear there are others. (In
>>> which variables are the re-operation codes?)
>>>
>>> I fear that's only a start and you may need to report back.
>>>
>>> Nick
>>>
>>> On Fri, Jul 6, 2012 at 3:22 PM, Stephen Martin
>>> <stephen.martin@york.ac.uk> wrote:
>>>
>>>> I have a dataset for patients admitted to hospital.  Each record
>>>> includes:
>>>>
>>>> a) a patient identifier;
>>>> b) 24 four digit operation procedure variables (oper_1 - oper_24;
>>>> these are four digit strings such as A148, C169, etc); and
>>>> c) 24 date of operation variables (operdate_1 - operdate_24).
>>>>
>>>> Many of the operation procedure code and date variables are empty.
>>>> The operation procedure code variables are not necessarily in date
>>>> order.
>>>>
>>>> I have a list (list A) of, say, 20 four digit operation codes that can
>>>> identify whether a patient received an operation for the condition in
>>>> which I am interested.
>>>>
>>>> I also have a list (list B) of, say, 35 re-operation codes.
>>>>
>>>> I would like to identify those patients who had both the operation and
>>>> the re-operation.
>>>>
>>>> However, I cannot solely use lists A and B because some of the same
>>>> codes appear in both lists, and a re-operation must occur at least one
>>>> day later than the initial operation.
>>>>
>>>> Thus I would like to identify patients who:
>>>> (a) have an operation code from list A
>>>> and
>>>> (b) have a re-operation code from list B
>>>> and
>>>> (c) where the date of the re-operation is later than the initial
>>>> operation.
>>>>
>>>> Suggestions on how to do this would be very welcome!
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
```
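The steps discussed in the thread (reshape to long, clean up, then flag re-operations by date) can be put together in a minimal sketch. This is one possible reading of the original question, not the code anyone in the thread posted. The toy data, the contents of list A and list B, and the variable names `inA`, `inB`, `first_A`, `reop` and `any_reop` are all assumptions made for illustration. The sketch reads "at least one day later" as a date difference of `>= 1`, and it compares each list B operation with the patient's earliest list A operation.

```stata
* Toy data in the wide layout described in the question (3 slots, not 24).
* All codes and dates below are invented for illustration.
clear
input id str4 oper_1 str4 oper_2 str4 oper_3 operdate_1 operdate_2 operdate_3
1 "A123" "C567" "X678" 18000 18000 18010
2 "A123" ""     ""     18100 .     .
3 "X678" "A123" ""     18200 18205 .
end

* Wide to long: one row per patient and operation slot
reshape long oper_ operdate_, i(id) j(position)
rename oper_ op_code
rename operdate_ op_date
drop if missing(op_code) | missing(op_date)

* Hypothetical code lists; substitute the real list A and list B.
* inlist() takes at most 10 string arguments, so longer lists need
* several inlist() calls joined with |, or a merge against a lookup file.
gen byte inA = inlist(op_code, "A123", "B679")
gen byte inB = inlist(op_code, "X678", "A123")

* Date of the earliest list A operation for each patient (missing if none)
egen first_A = min(cond(inA, op_date, .)), by(id)

* Re-operation: list B code dated at least one day after that operation.
* The !missing() guard matters because missing counts as very large in Stata.
gen byte reop = inB & !missing(first_A) & (op_date - first_A >= 1)

* Patient-level flag: 1 if the patient has any qualifying re-operation
egen byte any_reop = max(reop), by(id)

sort id op_date
list id op_code op_date first_A reop any_reop, sepby(id)
```

On this toy data only patient 1 should be flagged: patient 2 has no list B operation after the initial one, and patient 3's list B code predates the list A operation. A code that appears in both lists (here `A123`) does not count as a re-operation on the day of the initial operation, because the date difference is zero.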