
Re: st: creating a cumulative variable


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: creating a cumulative variable
Date   Tue, 9 Jul 2002 19:28:15 +0100

Sarah Mustillo wrote:

> I have a longitudinal data set in long form with annual obs
> for 6 years on 1450 kids.  I am trying to create a cumulative variable
> such that the value for age 10 is the sum of the values at age 9 and age
> 10, the value for age 11 is the sum of the value for 9, 10, & 11 and so
> on.  I can figure out how to do this in wide form, but I am going to have
> to repeat it for about 50 different variables. It would be much easier if
> I could just keep the data in long form.  I experimented with _n, and by
> id age, but couldn't get what I wanted (e.g., for age 11 I could get the
> sum of age 10 (_n-1) and age 11, but not age 9, 10, &11).

My advice is definitely to try to keep the data long. In Stata, most
things are easier done long. There are exceptions, as when what you
want is provided by some -egen, r*()- function, but for most
longitudinal stuff, long is better in my experience.

You have an identifier for each child -id-, an -age-, and, generically,
a -response-.

It sounds as if you need just the result of -sum()-, which gives
the cumulative sum. So we need to sort on -id- and then within
-id- on -age-.

bysort id (age) : gen Cresponse = sum(response)

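-bysort id- ensures that we do this separately for each -id-.

-bysort id (age)- ensures that we do it separately for each -id- and
in the right age order: the data are sorted on -age- within -id-, but
-age-, being in parentheses, does not define the groups.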
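To check that this does what is wanted, here is a minimal sketch on a toy dataset. The data are invented purely for illustration: two children, ages 9 to 11, and one hypothetical variable -response-.

```stata
* Toy data, invented for illustration only
clear
input id age response
1  9 2
1 10 3
1 11 1
2  9 0
2 10 4
2 11 5
end

* Running sum of response within each id, in age order
bysort id (age) : gen Cresponse = sum(response)

list, sepby(id)
```

For child 1, -Cresponse- should be 2, 5, 6; for child 2, it should be 0, 4, 9. The sum restarts for each -id-. Note that -sum()- treats missing values as zero, so a missing -response- leaves the running total unchanged rather than making it missing.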

> Once I create this new variable, is there an easy way to repeat the same
> thing for 50 different variables?  I usually just copy and paste in my do
> file, changing variable names, but lately the do files have gotten rather
> long.  It seems there is probably an easy way to have it do the same
> thing over and over for different variables, but I can't figure it out.

We just need to -sort- once

sort id age

and then use -foreach- to cycle through a varlist

foreach v of var <varlist> {
	by id : gen C`v' = sum(`v')
}

You must plug in your <varlist>. It can be in abbreviated form,
using * ? and/or -, etc.

Naturally, you can use your own naming convention, but
this is usually easiest with just some short prefix or suffix
added to the original variable name to give a new name.
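As a concrete sketch of the loop, suppose the responses are called -score1- and -score2-. Those names and the data are hypothetical, chosen only so that a wildcard can pick them up.

```stata
* Toy data, invented for illustration only
clear
input id age score1 score2
1  9 2 1
1 10 3 0
1 11 1 4
2  9 0 2
2 10 4 2
2 11 5 1
end

* Sort once, then loop over every variable matching score*
sort id age
foreach v of varlist score* {
    * New variable is the old name with prefix C, e.g. Cscore1
    by id : gen C`v' = sum(`v')
}

list, sepby(id)
```

This should create -Cscore1- and -Cscore2-, each a running sum within -id-. A prefix has one small advantage over a suffix here: the new names do not match -score*-, so running the loop again with the same wildcard would not pick up the cumulative variables as well. Variable names are limited to 32 characters, so very long original names may need a shorter naming scheme.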

This particular problem can also be done with -for-.
My main reservation about -for- is that it doesn't
grow gracefully when extended to more complicated
problems, whereas -foreach- typically does.

Nick
n.j.cox@durham.ac.uk
