Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Rebecca Pope <rebecca.a.pope@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Number of people present by date and time
Date: Thu, 29 Nov 2012 13:37:12 -0600

Nick Cox wrote: "The number in the clinic should be zero when the clinic is closed."

As a geographer, Nick can be forgiven for having too high an opinion of the efficiency of health care operations. I would not necessarily toss observations out if patients are still in clinic when it is technically closed, at least within some tolerance window.

I'll speak just from experience with U.S. health system data because that is all I've analyzed. Clinic hours are usually given as, e.g. 7:30 am to 4 pm, which means they do not schedule patients for appointments that start after 3:45 pm and are planned to end by 4 pm. One sees immediately the potential problems with this as one small delay means that patients may still be in the waiting room at 4 or even well beyond.

Might be another interesting research question for Simon...

Rebecca

On Thu, Nov 29, 2012 at 1:04 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>
> Should be easier than I implied. Even if a unique identifier doesn't
> exist for each observation, you just create one. For a _big_ dataset,
> be careful on variable type.
>
> I am assuming that -arrival- and -depart- are Stata date-times.
>
> gen long obsid = _n
> expand 2
> bysort obsid : gen inout = cond(_n == 1, 1, -1)
> by obsid : gen double time = cond(_n == 1, arrival, depart)
> sort time
> gen present = sum(inout)
>
> Two simple checks on logic and data quality
>
> 1. The number in the clinic should never be negative.
>
> 2. The number in the clinic should be zero when the clinic is closed.
>
> Nick
>
> On Thu, Nov 29, 2012 at 2:01 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> > Each observation is, I gather, a patient. One technique is to make
> > each observation an arrival or departure. For a very simple toy
> > dataset with just times for one day:
> >
> > . l
> >
> >      +-----------------------+
> >      | arrival   depart   id |
> >      |-----------------------|
> >   1. |    1000     1100    1 |
> >   2. |    1030     1200    2 |
> >   3. |    1230     1300    3 |
> >      +-----------------------+
> >
> > . expand 2
> > (3 observations created)
> >
> > . bysort id : gen inout = cond(_n == 1, 1, -1)
> >
> > . by id : gen time = cond(_n == 1, arrival, depart)
> >
> > . sort time
> >
> > . l
> >
> >      +--------------------------------------+
> >      | arrival   depart   id   inout   time |
> >      |--------------------------------------|
> >   1. |    1000     1100    1       1   1000 |
> >   2. |    1030     1200    2       1   1030 |
> >   3. |    1000     1100    1      -1   1100 |
> >   4. |    1030     1200    2      -1   1200 |
> >   5. |    1230     1300    3       1   1230 |
> >      |--------------------------------------|
> >   6. |    1230     1300    3      -1   1300 |
> >      +--------------------------------------+
> >
> > . gen present = sum(inout)
> >
> > . l, sep(0)
> >
> >      +------------------------------------------------+
> >      | arrival   depart   id   inout   time   present |
> >      |------------------------------------------------|
> >   1. |    1000     1100    1       1   1000         1 |
> >   2. |    1030     1200    2       1   1030         2 |
> >   3. |    1000     1100    1      -1   1100         1 |
> >   4. |    1030     1200    2      -1   1200         0 |
> >   5. |    1230     1300    3       1   1230         1 |
> >   6. |    1230     1300    3      -1   1300         0 |
> >      +------------------------------------------------+
> >
> > This is only one trick, and others will depend on your data. For
> > example, if your clinic is only open daily, you may be able to, or
> > need to, exploit that. If patients can come to a clinic more than once
> > a day that will provide a complication.
> >
> > All told, you should not need loops here. The two keys are likely to
> > be (1) the best data structure (2) heavy use of -by:-.
> >
> > Nick
> >
> > On Thu, Nov 29, 2012 at 1:35 PM, Simon <scmoore.lists@googlemail.com> wrote:
> >
> >> This is quite possible a rather naive question, but for some reason I am
> >> stuck.
> >>
> >> I have data from a clinic. I have the time each patient checks in
> >> (arrdatetime), the time they leave (depdatetime) and the time taken to
> >> first consultation (waittime) in minutes. What I would like to do is
> >> compare the number of people in the clinic for each patient at
> >> arrdatetime with waittime.
> >>
> >> So far the best I can come up with is to write a loop, going through
> >> every patients' arrdatetime and counting up those whose arrival and
> >> departure times span this value. But I have rather a lot of data and
> >> this seems terribly inefficient.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
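
The tolerance-window idea can be combined with the expand-and-cumulate approach quoted in this thread. What follows is only a minimal sketch on the toy data from the thread, not anyone's posted solution. The closing time (1200), the tolerance (15), and the variable name -late- are hypothetical values chosen for illustration; none of them appear in the thread. The sketch also adds -inout- to the sort key, on the assumption that a departure and an arrival sharing the same time should be processed departure first, since -sort time- alone leaves the order of ties arbitrary.

```stata
* Toy data from the thread: one observation per patient
clear
input arrival depart id
1000 1100 1
1030 1200 2
1230 1300 3
end

* Hypothetical values, NOT from the thread: closing time and tolerance,
* expressed in the same units as the toy times
local close     = 1200
local tolerance = 15

* One observation per event: +1 for an arrival, -1 for a departure
expand 2
bysort id : gen inout = cond(_n == 1, 1, -1)
by id : gen time = cond(_n == 1, arrival, depart)

* At tied times, process departures (-1) before arrivals (+1)
sort time inout id
gen present = sum(inout)

* Check 1 from the thread: the count should never be negative
assert present >= 0

* Relaxed check 2: flag only events at which someone is still present
* later than closing time plus the tolerance window
gen byte late = present > 0 & time > `close' + `tolerance'
list if late
```

With real Stata date-times, which are stored in milliseconds, the tolerance would need to be in milliseconds as well (15 minutes is 15*60000), and -time- should be generated as -double-, as in the code quoted in the thread.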

**References**:

- **st: Number of people present by date and time** *From:* Simon <scmoore.lists@googlemail.com>
- **Re: st: Number of people present by date and time** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: Number of people present by date and time** *From:* Nick Cox <njcoxstata@gmail.com>
