Hi
I wanted to ask how I can sum a group of variables into a single
variable in a large data set.
Suppose I have a list of variables as follows:
diag3201
diag4201
diag8203
diag9201
diag9202
diag9203
diag9204
diag10214
diag10218
diag11201
diag98201
diag98202
diag98204
diag98205
diag98206
diag98207
diag100201
diag100202
diag100203
diag100204
diag100205
diag100206
diag100207
diag100208
diag100209
diag100210
diag100211
diag100212
diag100213
diag100214
diag100215
diag100216
diag100217
diag100218
diag100219
diag100220
diag100221
The last three digits of each variable follow the same pattern: they
start at 201 and run up to 222 for some diagnoses. The leading digits
indicate the diagnosis. For instance, for diagnosis 98 we have variables
diag98201 through diag98207, and for diagnosis 9 we have variables
diag9201-diag9204. I want to generate a new variable for each diagnosis
that equals the sum of the variables relating to that diagnosis. For
example, I want to generate diag98 = diag98201 + diag98202 + … +
diag98207, or diag9 = diag9201 + diag9202 + … + diag9204.
I know how to do it for one diagnosis at a time, using egen diag98 =
rsum(diag98201-diag98207), but I need to do it for a large number of
diagnoses. How can I do it for all of them at once?
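To make the goal concrete, here is a minimal, untested sketch of the kind of loop being asked for. It assumes the diagnosis codes are exactly those visible in the list above (3, 4, 8, 9, 10, 11, 98, 100; a real data set would need the full list), that the suffixes run from 201 to 222 as described, and that rsum()'s treatment of missing values as zero is acceptable. The macro names d, k and vars are purely illustrative.

```stata
* Sketch only: the diagnosis codes below are assumed from the listed variables
foreach d of numlist 3 4 8 9 10 11 98 100 {
    * Collect the variables that actually exist for this diagnosis
    local vars
    forvalues k = 201/222 {
        capture confirm variable diag`d'`k'
        if _rc == 0 {
            local vars `vars' diag`d'`k'
        }
    }
    * Create the row sum only if at least one variable was found
    if "`vars'" != "" {
        egen diag`d' = rsum(`vars')
    }
}
```

Building the list name by name, rather than using a range such as diag98201-diag98207, means the result does not depend on the order of the variables in the data set. Note that confirm variable accepts unambiguous abbreviations, so this relies on no other variable name starting with one of the tested names.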
Thanks
David
P.S. I am using Stata 8.
David Messika
The Health Economics Unit
The Gertner Institute for Epidemiology and Health Policy Research
Chaim Sheba Medical Center
Ramat-Gan, Israel
Tel: 972-3-5303935, 972-3-5303277
