# Re: st: AW: simple sum() question

From: "Martin Weiss"
Subject: Re: st: AW: simple sum() question
Date: Wed, 15 Apr 2009 19:26:22 +0200

Those two differ only in case you have missings...

HTH
Martin
----- Original Message -----
From: "Shehzad Ali" <sia500@york.ac.uk>
Sent: Wednesday, April 15, 2009 7:06 PM
Subject: RE: st: AW: simple sum() question

Thanks, Nick. But I am not trying to count the total number of observations per patient but the total number of visits (varname: clinic) across all time points for each patient (I tried to clearly state it in the first post - sorry if I wasn't clear).
The solution I am now using is:

bysort patient_id: egen sum_clinic = sum(clinic)

Thank you,

Shehzad
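
A minimal sketch of the per-patient sum on made-up data (the values below are hypothetical, not from the thread). It uses `total()`, which is the documented name of the `egen` function that older releases called `sum()`. By default `total()` treats a missing value as zero.

```stata
* Toy long-format data: hypothetical values for illustration only
clear
input patient_id week clinic
1 4 2
1 8 0
1 12 1
2 4 3
2 8 .
3 4 1
end

* Same total repeated on every row of a patient; the dataset is not collapsed
bysort patient_id: egen sum_clinic = total(clinic)

list, sepby(patient_id) noobs
```

Patients keep however many rows they happen to have, so nothing here requires four observations per patient.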

On Apr 15 2009, Nick Cox wrote:

Unless there are further complications as yet unrevealed,

```
bysort id : gen visits = _N
```

is a direct and simple solution.

If you just wanted to count a subset, then

```
gen interesting = <binary variable defining interesting>
bysort interesting id : gen interesting_visits = _N if interesting
```

There are -egen- routes as well, but for problems like this going back
to basics is difficult to beat.
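
The two `_N` commands above can be tried on a small made-up dataset. In this sketch the data values and the definition of `interesting` (rows with at least one clinic visit) are assumptions chosen only for illustration.

```stata
* Toy data: hypothetical values for illustration only
clear
input id week clinic
1 4 2
1 8 0
1 12 1
2 4 3
2 8 0
end

* Number of observations (rows) per id
bysort id : gen visits = _N

* Assumed definition of the subset: rows with at least one clinic visit.
* Note that clinic > 0 is also true when clinic is missing; this toy
* dataset has no missing values.
gen interesting = clinic > 0
bysort interesting id : gen interesting_visits = _N if interesting

sort id week
list, sepby(id) noobs
```

Rows outside the subset get a missing `interesting_visits`, because the `if interesting` qualifier leaves them unassigned.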
See also

SJ-2-1  pr0004  Speaking Stata: How to move step by: step
N. J. Cox
Q1/02   SJ 2(1):86--102   (no commands)
explains the use of the by varlist : construct to tackle
a variety of problems with group structure, ranging from
simple calculations for each of several groups to more
advanced manipulations that use the built-in _n and _N

if a tutorial is needed. That's free on-line at the Stata Journal
website.
Note that even if you did want a collapsed dataset, -contract- rather
than -collapse- is more direct.
Nick
n.j.cox@durham.ac.uk

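
For the case where a reduced dataset is wanted, a short sketch of -contract- on made-up data follows. The values and the variable name `visits` are assumptions for illustration.

```stata
* Toy data: hypothetical values for illustration only
clear
input id week
1 4
1 8
1 12
2 4
2 8
end

* One row per id, with the number of original rows stored in visits
contract id, freq(visits)

* A -collapse- version would need a variable to count, for example:
*   collapse (count) visits = week, by(id)

list, noobs
```

Unlike the `bysort` approach, this replaces the data in memory with one observation per `id`.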
Shehzad Ali

Hi Martin and Josiane,

Thank you for your replies. You are right that I am interested in the
total count of visits for each patient and not the running sum.

Sorry, I should have mentioned that patients who had three visits, for instance, have three observations, and those with two visits have two observations. Therefore, the total number of observations for 100 patients is less than 400 (I had made up hypothetical numbers in haste to simplify the case. Not always a good idea).

With Martin's solution, I will need to have four observations for each patient (sorry this was my fault as I didn't provide the correct information). With Josiane's suggestion, the dataset collapses, which is not what I want.

Can you suggest a modified solution please? Again, sorry for the unclear email earlier.

On Apr 15 2009, Martin Weiss wrote:

I am betting that you want a count of visits, not a running sum, but correct me if I am wrong...
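```
clear*
set obs 400

// 100 patients with 4 rows each: 1,1,1,1,2,2,2,2,...
egen float patient = seq(), from(1) to(400) block(4)
// visit number cycles 1,2,3,4 within each patient
egen float visit = seq(), from(1) to(4) block(1)

//not strictly necessary
xtset patient visit

//less than 4 visits for some
replace visit =. if runiform()<0.05

// number of nonmissing visits per patient, repeated on every row
bys patient: egen overallvisits=count(visit)

l in 1/20, sepby(patient) noo
```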
*************
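
One place where missing values matter is the choice between `egen ... count()` and `_N`: `count()` counts nonmissing values, while `_N` counts rows. The sketch below uses made-up data (all values and the new variable names are hypothetical) to show the difference.

```stata
* Toy data: hypothetical values; patient 1 has one missing visit
clear
input patient visit
1 1
1 2
1 .
1 4
2 1
2 2
2 3
2 4
end

* Nonmissing values of visit per patient
bysort patient: egen n_nonmissing = count(visit)

* Rows per patient, whether or not visit is missing
bysort patient: gen n_rows = _N

list, sepby(patient) noobs
```

When a dataset simply has no row for a visit that did not happen, the two results agree.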
Shehzad Ali

I have a simple question about summing across observations. I have 100 patients (variable: patient_id) in the dataset, each had clinic visits (variable: clinic) and hospital visits (variable: hospital) recorded at weeks 4, 8, 12 and 16. The dataset is long and hence I have 400 observations (one observation per patient per time point).

I want to sum the clinic visits for each patient (across all 4 visits) bearing in mind that some patients had less than 4 visits. So effectively I want to generate a new variable that will produce the sum of clinic visits for each patient.
