Re: st: Handling pharmacy data with multiple entries per subject

From   Phil Schumm
Subject   Re: st: Handling pharmacy data with multiple entries per subject
Date   Fri, 10 Jun 2011 16:00:21 -0500

On Jun 10, 2011, at 2:59 PM, Doernberg, Sarah wrote:
I have a dataset from our pharmacy with prescriptions for antibiotics in hospitalized patients. Each time a patient was transferred (from the emergency department to the ward or the ward to the ICU, for instance), a new prescription (and thus, a new row) was generated. This is compounded by the fact that some people received intermittent dosing (each start date with its own row).

Because this is a very large set of data, I am trying to figure out how to have Stata combine the rows. Ideally, I would like to have one entry per person with consecutive courses of antibiotics represented by start and stop days (for example, someone who received an antibiotic from 6/1-6/3 and 6/7-6/9 would have start_date_1 = 6/1, stop_date_1 = 6/3, start_date_2 = 6/7, and stop_date_2 = 6/9).

I have tried doing this with the collapse command but the best I can do is to get total days on antibiotic in a given month. Converting from long to wide also is not ideal because consecutive courses are not combined due to the multiple prescriptions based on location.

You'll have to be a bit more specific here to get the help you're asking for. For example, why do you want the data in the following wide layout?

    start_date_1   stop_date_1   start_date_2   stop_date_2
    ------------   -----------   ------------   -----------
        6/1            6/3           6/7            6/9

I'm guessing (but I could be wrong) that your next step after this will be to do some further calculations, which can probably be done more easily with the data in the original, long format. Also, if you want help with the code to translate between what you have now and the layout above, then you need to show the actual layout of the current dataset. Otherwise, people will just guess, and the whole exercise becomes quite inefficient.

Don't be put off by this -- I do calculations like this all the time in Stata, and it is very easy to do once you know how. So this is well worth persisting with.
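
As a rough illustration of the kind of long-format calculation meant here, the following is a minimal sketch only, not a solution for the actual dataset, whose layout has not been shown. It assumes a hypothetical layout with one row per prescription and variables id (patient), start, and stop (Stata daily dates), and it assumes that two prescriptions belong to the same course when the later one starts no more than one day after the latest stop date seen so far. The variable names, the toy data, and the one-day rule are all assumptions made for illustration.

```stata
* Toy data in a HYPOTHETICAL layout: one row per prescription
clear
input id str10 s_start str10 s_stop
1 "6/1/2011" "6/2/2011"
1 "6/2/2011" "6/3/2011"
1 "6/7/2011" "6/9/2011"
2 "6/5/2011" "6/6/2011"
end

* Convert the string dates to Stata daily dates
gen start = date(s_start, "MDY")
gen stop  = date(s_stop, "MDY")
format start stop %td
drop s_start s_stop

sort id start stop

* Running maximum of the stop date within patient
* (replace works through the observations in order)
by id: gen maxstop = stop if _n == 1
by id: replace maxstop = max(stop, maxstop[_n-1]) if _n > 1

* ASSUMPTION: a new course begins when the gap from the latest
* stop date so far exceeds one day
by id: gen byte newcourse = (_n == 1) | (start > maxstop[_n-1] + 1)
by id: gen course = sum(newcourse)

* One row per patient and course (still long format)
collapse (min) start (max) stop, by(id course)
format start stop %td
list, sepby(id)

* Optional: the wide layout asked for in the original question
rename start start_date_
rename stop stop_date_
reshape wide start_date_ stop_date_, i(id) j(course)
```

With the toy data, patient 1's first two prescriptions are combined into a single course from 6/1 to 6/3, and the prescription starting 6/7 becomes a second course. The data after collapse are still long (one row per patient and course), which is usually the more convenient form for further calculations; the final reshape is needed only if the wide layout is really required. If courses should be kept separate by drug, the drug variable would have to be added to the sort and by lists.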

-- Phil
