From
Radu Ban <rban@nber.org>

To
statalist@hsphsun2.harvard.edu

Subject
st: average value among differing numbers of variables

Date
Thu, 17 Jul 2003 00:07:05 -0400

Dear listers,

This is a data management question. The data that I'm looking at (daily U.S. weather) has the following structure.

day1 flag1 day2 flag2 day3 flag3 day4 flag4 day5 flag5 ... day31 flag31
0    s     a2   a     0    s     0    s     a5   a     ... a31
0    s     0    s     b3   a     b4         b5         ... b31
c1         0    s     0    s     0    s     c5   a     ... c31

The "s" flag means that the measured element (say, inches of rain) was accumulated over those days, which are assigned a value of 0; the accumulated total is reported on the day flagged "a". I would like to replace the value on every day of an accumulation series (the "s" days and the closing "a" day) with the accumulated total averaged over the days in that series.

Specifically, using the notation above, I would like to replace each of 0, 0, b3 (in the second row) with b3/3, each of 0, 0, 0, c5 (in the third row) with c5/4, and so on. Note that, as in the first row, there can be more than one accumulation series per row.

I figured that each possible accumulation series a_ij (starting at day i and ending at day j) must be identified by its own indicator variable, so that in the end I can use something like:

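forval j = 2/31 {
    forval i = 1/`=`j'-1' {
        * a_`i'_`j' == 1 marks a series that starts on day i and ends on day j
        * (the underscore keeps names such as a_1_12 and a_11_2 distinct)
        gen daymean = day`j' / (`j' - `i' + 1) if a_`i'_`j' == 1
        * spread the daily average over every day in the series
        forval k = `i'/`j' {
            replace day`k' = daymean if a_`i'_`j' == 1
        }
        drop daymean
    }
}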

But I'm not sure how to define all the a_ij indicators.
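For what it is worth, here is a rough sketch of one alternative that might avoid defining the a_ij indicators altogether: count the length of each series in a forward pass, then spread the daily average backward from each "a" day. It is only a sketch under several assumptions: flag1-flag31 are string variables containing "s", "a", or "", the "s" days really hold 0, and run, carry, and len1-len31 are hypothetical helper names that do not already exist in the data. A series that began in the previous month is averaged only over the days visible in the row, and a run of "s" days with no closing "a" in the row is left unchanged.

```stata
* Forward pass: record the length of each series at its closing "a" day.
gen run = 0                       // number of consecutive "s" days seen so far
forvalues j = 1/31 {
    gen len`j' = run + 1 if flag`j' == "a"
    replace run = cond(flag`j' == "s", run + 1, 0)
}

* Backward pass: carry the daily average from each "a" day over its "s" days.
gen carry = .
forvalues j = 31(-1)1 {
    replace carry = day`j' / len`j' if flag`j' == "a"
    replace carry = . if flag`j' != "a" & flag`j' != "s"
    replace day`j' = carry if inlist(flag`j', "a", "s") & carry < .
}

drop run carry len1-len31
```

On the first example row this should give a2/2 for days 1-2 and a5/3 for days 3-5, so more than one series per row is handled without any extra bookkeeping.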

I hope my question is clear. Any help is greatly appreciated.

Thank you,

Radu Ban

