RE: st: RE: creating an index variable: number of presentations to hospital by participant

From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: RE: creating an index variable: number of presentations to hospital by participant
Date   Thu, 28 Oct 2010 10:54:06 +0100

That is, 

bysort patient_id year (arrivaldate): gen order_in_year = _n 
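As a minimal sketch of how that line behaves, here is a toy dataset. The variable names patient_id and arrivaldate follow the thread; the values, the string variable arrival, and the extra variable n_in_year are invented for illustration, and year is assumed to be derived from the arrival date.

```stata
* Toy data: identifiers and dates are invented for illustration
clear
input patient_id str9 arrival
1 "30mar2009"
1 "02jan2010"
1 "30mar2010"
2 "31mar2010"
2 "01apr2010"
end

* Convert the string to a Stata daily date and extract the year
gen arrivaldate = date(arrival, "DMY")
format arrivaldate %td
gen year = year(arrivaldate)

* Order of presentation within each patient-year, by date
bysort patient_id year (arrivaldate): gen order_in_year = _n

* Number of presentations within each patient-year
bysort patient_id year: gen n_in_year = _N

list patient_id arrivaldate year order_in_year n_in_year, sepby(patient_id year)
```

The variable in parentheses is used for sorting only, so the counter restarts for each patient_id and year combination rather than for each date.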

A tutorial on -by:- is included within 

SJ-2-1  pr0004  . . . . . . . . . . Speaking Stata:  How to move step by: step
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/02   SJ 2(1):86--102                                  (no commands)
        explains the use of the by varlist : construct to tackle
        a variety of problems with group structure, ranging from
        simple calculations for each of several groups to more
        advanced manipulations that use the built-in _n and _N

which is accessible to all via the Stata Journal website. I got the above text (and other hits) within Stata by 

. search by, sj 

-- and if you do that yourself you will get a clickable link to the corresponding .pdf. 

[email protected] 

Mitch Abdon

You may add the year variable in the -by- prefix. You need to include
the year variable in the -sort- as well. See "help by".

Alison McCarthy

> regarding the formula below. Is there a way to also split this up by year of presentation?
> This means you have the number of ordered presentations for each participant calculated separately for each year.

Kieran McCaul

>> You'll need some sort of patient ID that identifies individuals.
>> Assuming you have that, then:
>> sort patient_id arrivaldate, stable
>> by patient_id: gen order=_n
>> look at -help sort- for an explanation of the option -stable-.
>> Without this option, sort will randomly order observations with the same
>> patient_id and arrivaldate.
>> Of course, if you have the arrival time as well, you could sort on:
>> sort patient_id arrivaldate arrivaltime, stable
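A minimal sketch of the approach quoted above, run on toy data. The identifiers and dates are invented, and the variable total (the number of presentations per patient, taken from _N) is an addition not in the quoted post.

```stata
* Toy data: patient 1 presents twice on one day,
* patient 2 presents on two consecutive days
clear
input patient_id str9 arrival
1 "30mar2010"
1 "30mar2010"
2 "31mar2010"
2 "01apr2010"
end

gen arrivaldate = date(arrival, "DMY")
format arrivaldate %td

* -stable- keeps tied observations in their existing relative order
sort patient_id arrivaldate, stable

* Order of each presentation and total presentations per patient
by patient_id: gen order = _n
by patient_id: gen total = _N

list patient_id arrivaldate order total, sepby(patient_id)
```

With same-day ties and no arrival time, the order within a day reflects only the order the rows already had in the data, which is why an arrival time variable is worth adding to the sort if one exists.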

Alison McCarthy

>> I am working with a large data file containing presentations to a
>> hospital emergency department (ED). A number of individuals make more
>> than one presentation, which I have seen via the command 'duplicates'.
>> The organisation of my data is below:
>> 30-Mar-10 0 70 50 30-Mar-10
>> 30-Mar-10 0 70 50 30-Mar-10
>> 31-Mar-10 0 70 50 31-Mar-10
>> 01-Apr-10 0 70 50 01-Apr-10
>> As can be seen, the first person presented twice to the ED on the one day;
>> the second person presented twice in a two-day period.
>> I wish to create a new variable which tells me the number and order (in
>> terms of date) of presentations for each participant, and would greatly
>> appreciate some guidance.
