
Re: st: simple sum() question

From	Shehzad Ali <[email protected]>
To	[email protected]
Subject	Re: st: simple sum() question
Date	17 Apr 2009 13:33:33 +0100

Following on after yesterday's discussion, I have a quick follow-on question.

Here is a quick summary of what I am doing. Each patient (varname: patient_id) was observed at 4 time points and at each time point we asked about the clinic visits (varname: clinic) in the last 3 months. The dataset is in long form (shown below):

patient_id        timepoint   clinic
1                 1           0
1                 2           1
1                 3           .
2                 1           2
2                 2           0
3                 1           1
3                 2           .

The line below generates a sum of all clinic visits for each patient:

bysort patient_id: egen sum_clinic = sum(clinic)

Now, if the clinic visit count is missing at any time point (as it is for patients 1 and 3), I want Stata to return a missing value for the sum. The above command returns the total of the non-missing observations, ignoring the missing ones (understandably). But if I try:


bysort patient_id: egen sum_clinic = sum(clinic) if clinic!=.

then it returns a missing value for the sum variable only at the time point that is missing, not at all the time points for that patient. Can anyone please suggest how to resolve this?
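
One possible way to get this behaviour, sketched here on the example data above rather than offered as a definitive answer: count the missing values of clinic within each patient, then blank the total wherever that count is positive. The variable name n_miss is made up for illustration.

```stata
clear
input patient_id timepoint clinic
1 1 0
1 2 1
1 3 .
2 1 2
2 2 0
3 1 1
3 2 .
end

* per-patient total of the non-missing values (missings are ignored here)
bysort patient_id: egen sum_clinic = total(clinic)

* number of missing clinic values within each patient (n_miss is a hypothetical name)
bysort patient_id: egen n_miss = total(missing(clinic))

* set the total to missing on every row of a patient with any missing value
replace sum_clinic = . if n_miss > 0

list, sepby(patient_id) noobs
```

The -if clinic!=.- qualifier only restricts which rows receive a value, which is why it cannot blank the other rows of the same patient; the replace step works at the patient level because n_miss is constant within patient_id.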

Secondly, what's the best way to collapse the dataset to one observation per patient? Once I have the sum_clinic for each patient, it would be easier just to have one observation per patient.
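
A minimal sketch of one way to do this, assuming sum_clinic is already constant within patient_id (the values typed in below are simply the ones asked for: missing for patients 1 and 3, and 2 for patient 2). It keeps the first row of each patient rather than using -collapse (sum)-, which would return 0 rather than missing for a patient whose values are all missing.

```stata
clear
input patient_id timepoint clinic sum_clinic
1 1 0 .
1 2 1 .
1 3 . .
2 1 2 2
2 2 0 2
3 1 1 .
3 2 . .
end

* keep one row per patient; sum_clinic is the same on every row of a patient
bysort patient_id (timepoint): keep if _n == 1

* drop the variables that only make sense at the time-point level
keep patient_id sum_clinic

list, noobs
```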


Thank you,
Shehzad


On Apr 15 2009, Martin Weiss wrote:

<>

Those two differ only in case you have missings...

HTH
Martin
_______________________
----- Original Message -----
From: "Shehzad Ali" <[email protected]>
To: <[email protected]>
Sent: Wednesday, April 15, 2009 7:06 PM
Subject: RE: st: AW: simple sum() question

Thanks, Nick. But I am not trying to count the total number of observations per patient but the total number of visits (varname: clinic) across all time points for each patient (I tried to clearly state it in the first post - sorry if I wasn't clear).
The solution I am now using is:

bysort patient_id: egen sum_clinic = sum(clinic)

Thank you,

Shehzad

On Apr 15 2009, Nick Cox wrote:
Unless there are further complications as yet unrevealed,
bysort id : gen visits = _N
is a direct and simple solution.
If you just wanted to count a subset, then
gen interesting = <binary variable defining interesting>
bysort interesting id : gen interesting_visits = _N if interesting
There are -egen- routes as well, but for problems like this going back
to basics is difficult to beat.
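
A small worked illustration of the two -by:- idioms above, on made-up data. The definition of "interesting" used here (a time point with at least one clinic visit) is only an assumption for the example.

```stata
clear
input id clinic
1 0
1 1
1 .
2 2
2 0
end

* number of observations per id
bysort id : gen visits = _N

* hypothetical subset: rows with a positive, non-missing clinic count
* (missing counts as larger than any number in Stata, hence the explicit check)
gen interesting = clinic > 0 & !missing(clinic)

* number of interesting observations per id, filled in only on those rows
bysort interesting id : gen interesting_visits = _N if interesting

list, sepby(id) noobs
```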
See also
SJ-2-1  pr0004  . . . . . . . . .  Speaking Stata:  How to move step by: step
       . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
       Q1/02   SJ 2(1):86--102                               (no commands)
       explains the use of the by varlist : construct to tackle
       a variety of problems with group structure, ranging from
       simple calculations for each of several groups to more
       advanced manipulations that use the built-in _n and _N

if a tutorial is needed. That's free on-line at the Stata Journal
website.
Note that even if you did want a collapsed dataset, -contract- rather
than -collapse- is more direct.
Nick
[email protected]

Shehzad Ali

Hi Martin and Josiane,

Thank you for your replies. You are right that I am interested in the
total count of visits for each patient and not the running sum.
Sorry, I should have mentioned that patients who had three visits, for
instance, have three observations, and those with two visits have two
observations. Therefore, the total number of observations for 100
patients is less than 400 (I had made up hypothetical numbers in haste to
simplify the case. Not always a good idea).
With Martin's solution, I will need to have four observations for each
patient (sorry this was my fault as I didn't provide the correct
information). With Josiane's suggestion, the dataset collapses which is
not what I want.

Can you suggest a modified solution please? Again, sorry for the unclear
email earlier.

On Apr 15 2009, Martin Weiss wrote:
I am betting that you want a count of visits, not a running sum, but correct me if I am wrong...
clear*
set obs 400
egen float patient = seq(), from(1) to(400) block(4)
egen float visit = seq(), from(1) to(4) block(1)

//not strictly necessary
xtset patient visit

//less than 4 visits for some
replace visit =. if runiform()<0.05

bys patient: egen overallvisits=count(visit)

l in 1/20, sepby(patient) noo
*************
Shehzad Ali
I have a simple question about summing across observations. I have 100
patients (variable: patient_id) in the dataset, each had clinic visits
(variable: clinic) and hospital visits (variable: hospital) recorded at
weeks 4, 8, 12 and 16. The dataset is long and hence I have 400
observations (one observation per patient per time point).
I want to sum the clinic visits for each patient (across all 4 visits)
bearing in mind that some patients had fewer than 4 visits. So effectively
I want to generate a new variable that will produce the sum of clinic
visits for each patient.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

