# st: cycling through individual indices

```
Dear Statalisters,

I am using Stata 10.1 IC and am trying to create individual-specific sums
in a large dataset. The problem is a bit complicated: at the moment I cycle
through all individuals and variables using the "in" qualifier. I am
curious whether anyone has an idea how to solve this more efficiently.
Here is the problem:

The data are in wide format and look like

ID      Agemonth        Var_1   Var_2...                ...Var_623 Var_624
1       532             2       2                       14      14
2       345             7       7                       14      Mis
3       236             3       3                       Mis     Mis
4       267             2       2                       12      12

and so forth; there are about 50,000 observations. "Agemonth" gives the
length of the observation period in months, which is individual-specific:
month "1" is January of the year after the person turned 14, "2" is
February, and so forth. For example, "ID" 1 was observed for 532 months,
counted from that January. The index of each variable refers to the same
time scale. Thus, person 1 was observed from Var_1 until Var_532.
Unfortunately, that does not mean that Var_533 or even Var_623 is missing;
either may still hold a value, as in the example above (where Var_623 and
Var_624 are 14 for ID 1).

Each Var_# takes a number of distinct values, and for each value I need to
count how often it occurs for each person. If I had no invalid observations
I could type

egen sum1 = anycount(Var_*), values(1)

However, this also counts the invalid observations beyond Agemonth.

I ended up looping through individuals (~50,000) and variables (624),
summing up one by one, but I really doubt that this is the "best" solution
(and hope that it is not):

*******************
#d;

gen sum1 = 0;
sort ID;
gen index = _n;
qui sum index;

forvalues indis = `r(min)'/`r(max)' {;

di "`indis'";

forvalues f = 1/624 {;

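* count month f only if it lies within this person's observation period;
* (the if command takes no "in" qualifier, so subscript the observation);
if `f' <= Agemonth[`indis'] {;
qui replace sum1 = sum1 + (Var_`f' == 1) in `indis';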
};

};
};
*******************

Another possibility would be to have the data in long format. However,
since I have so many periods, it takes a while to reshape the data, even in
portions. I tried that with a 10% sample and "reshape" took more than one
hour (maybe I have to ask for a better computer...).

Any help would be appreciated!

Thank you,

Johannes

```
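One possible way to avoid the loop over individuals is sketched below. It is only a sketch and has not been run on the data described in the post. It assumes that Var_1 to Var_624 are numeric and that "Mis" stands for a numeric missing value. The idea is to loop over the 624 variables only and let an `if` qualifier on `replace` compare the month index with each person's own Agemonth, so that every pass updates all observations at once.

```stata
* Minimal sketch under the assumptions stated above; untested on the real data.
* sum1 counts, per person, the valid months in which Var_# equals 1.
gen int sum1 = 0

forvalues f = 1/624 {
    * The if qualifier is evaluated row by row, so month f is counted
    * only for persons whose observation period reaches at least month f.
    * !missing(Agemonth) guards against a missing Agemonth, which would
    * otherwise compare as larger than any month index.
    quietly replace sum1 = sum1 + (Var_`f' == 1) if `f' <= Agemonth & !missing(Agemonth)
}
```

This issues 624 `replace` commands in total instead of one per individual and variable, and it needs neither the `in` qualifier nor a reshape to long format. To count a different value, change the `1` in `(Var_`f' == 1)` and the name of the sum variable accordingly.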