# Re: st: time-series data identified by three variables

From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: time-series data identified by three variables
Date: Wed, 28 Nov 2012 09:51:34 +0000

Yes, as that is what you want.

-by:- is well documented. There are several sections in [U] (use the
index), and I wrote a tutorial as a supplement that is accessible online:

SJ-2-1  pr0004  . . . . . . . . . . Speaking Stata:  How to move step by: step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
Q1/02   SJ 2(1):86--102                                  (no commands)
explains the use of the by varlist : construct to tackle
a variety of problems with group structure, ranging from
simple calculations for each of several groups to more
advanced manipulations that use the built-in _n and _N

See http://www.stata-journal.com/sjpdf.html?articlenum=pr0004
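
As a minimal sketch of how by: combines with _n and _N for the repeat numbering discussed in this thread: the variable names follow the original question, but the toy values, the string variable visit, and the variable nvisits are invented for illustration, and date_of_visit is assumed to be (or be convertible to) a numeric daily date.

```stata
* Hypothetical toy data: names follow the thread, values are invented
clear
input patient_id illness_id str9 visit severity
1 10 "05jun2012" 3
1 10 "01jun2012" 2
1 10 "20jun2012" 5
1 20 "05jun2012" 1
2 10 "05jun2012" 4
end

* Assumption: dates arrive as strings; convert to a numeric daily date
* so that sorting is chronological
gen date_of_visit = date(visit, "DMY")
format date_of_visit %td

* Groups are patient-illness pairs; the variable in parentheses only
* orders observations within each group and does not define groups
bysort patient_id illness_id (date_of_visit) : gen repeat = _n - 1

* _N is the number of observations in each group, here the visit count
by patient_id illness_id : gen nvisits = _N

list patient_id illness_id date_of_visit repeat nvisits, sepby(patient_id illness_id)
```

Putting date_of_visit in parentheses makes the intent explicit: the first visit for each patient and illness gets repeat 0, the next gets 1, and so on.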

On Wed, Nov 28, 2012 at 9:44 AM, YANNAN SHEN <[email protected]> wrote:
> Dear Nick,
> For the first command, do you mean
>> bysort illness_id date_of_visit : egen meansev = mean(severity)
> Because all I care about is, on a certain date, the average severity across all patients who visited for a certain disease.
> For instance, if date = June 5, then day = 5 and month = June.
> Is bysort month day equivalent to bysort date?
>
>> You want commands like
>> bysort patient_id illness_id date_of_visit : egen meansev = mean(severity)
>> by patient_id illness_id : gen repeat = _n - 1
>> as you want to number 0 upwards.
>>
>> Nick
>> On Wed, Nov 28, 2012 at 6:28 AM, yannan shen <[email protected]> wrote:
>>
>>> I am working with some panel data on hospital visits and I want to learn
>>> about the severity of various diseases.
>>> The variables I have in the dataset are: patient_id, illness_id,
>>> date_of_visit, severity.
>>> Each observation contains: patient_id, illness_id, date_of_visit, severity.
>>> For each patient (identified by patient_id), I want to know how many
>>> times he has visited for the same illness (illness_id).
>>> I use the duplicates command to label the observations of patients who
>>> have visited the hospital more than once.
>>>> duplicates tag  patient_id illness_id , generate(duple)
>>>
>>> However, duple does not give any time-series
>>> information. If a patient has 5 visit records, I want to be able to
>>> know which is the 0th repeat, 1st repeat, 2nd repeat, 3rd repeat, and
>>> 4th repeat... I have a vague feeling that I can order those observations
>>> via date_of_visit but I am still not sure how exactly that can be
>>> done.
>>> Furthermore, I want to create two new variables: one variable equal
>>> to the average severity of each disease (illness_id) being treated on
>>> the same date_of_visit. The other variable equals the highest severity
>>> of a certain disease being treated on that day. (Ideally, I want to
>>> create additional variables for each observation.)
>>> I have used “bysort” in the past, but since the grouping is now a
>>> combination of illness_id and date_of_visit, I am a little confused.
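
The two summary variables asked for in the original question can be sketched as follows. This is only an illustration under stated assumptions: the toy values and the variables visit, maxsev, month, day and meansev_md are invented here, and date_of_visit is assumed to be a numeric daily date. The last step also suggests why grouping on month and day need not match grouping on the full date: the two agree only if the data never contain the same calendar day from different years.

```stata
* Hypothetical toy data: names follow the thread, values are invented
clear
input patient_id illness_id str9 visit severity
1 10 "05jun2012" 3
2 10 "05jun2012" 5
3 10 "06jun2012" 2
4 20 "05jun2012" 1
5 10 "05jun2011" 1
end

* Assumption: dates arrive as strings; convert to a numeric daily date
gen date_of_visit = date(visit, "DMY")
format date_of_visit %td

* Groups are defined jointly by illness and date; patient_id is left out
* so that each summary runs across all patients seen that day
bysort illness_id date_of_visit : egen meansev = mean(severity)
bysort illness_id date_of_visit : egen maxsev = max(severity)

* Grouping on month and day alone pools the same calendar day from
* different years (here 05jun2011 together with 05jun2012)
gen month = month(date_of_visit)
gen day = day(date_of_visit)
bysort illness_id month day : egen meansev_md = mean(severity)

list illness_id date_of_visit severity meansev maxsev meansev_md, sepby(illness_id)
```

Because egen with by: fills in the group result for every observation in the group, each visit record carries the mean and maximum for its illness and date, which is the "additional variables for each observation" layout requested.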

