At 09:39 PM 11/20/2003 +0000, A Delis wrote:

Dear Statalisters

You did not indicate the type of SIC. I will assume it is numeric.

I have a dataset like the following, where the SIC values are in ascending order:

SIC     EXP
2011    667.8039
2013    130.8002
.
.
.
.
2095    38.90306
2098    0.617763
2111    249.92
2121    3.738577
2131    35.32281
.
.
.
2298    5.709858
2299    39.7519
.
.
etc

What I want to find is the sum of the EXP values whose SIC value begins with 20, i.e. 667.8039+130.8002+...+38.90306+0.617763

In other words, I have some 4-digit SIC data and I want to aggregate them to 2-digit SIC.

gen byte sic_1 = int(SIC/100)

(If SIC is string, you will need to do something analogous, using substr().)
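For the string case, a minimal sketch might look like the following. It assumes SIC is stored as a four-character string such as "2011"; the name sic_num is only illustrative and is not from the original question.

```stata
* Assumption: SIC is a string variable holding 4-digit codes, e.g. "2011"
* Keep the first two characters as a string grouping variable
gen str2 sic_1 = substr(SIC, 1, 2)

* Alternatively, convert those two characters to a number
* (sic_num is a hypothetical name used only for this sketch)
gen byte sic_num = real(substr(SIC, 1, 2))
```

Either variable can then be used in by() in the same way as the numeric sic_1. If some codes have fewer than four characters, the first two characters may not be the 2-digit group, so that would need checking first.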

Then you aggregate by sic_1, i.e.,

egen mysum = sum(EXP), by(sic_1)

or you might want to -collapse- the data instead:

collapse (sum) EXP, by(sic_1)

Note that -egen- with sum() will retain the original form of the data, generating a value that is constant within each distinct value of sic_1. -collapse- will shrink the data to one observation per distinct value of sic_1.
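To see the difference between the two approaches, here is a small sketch using four of the rows from the question, typed in with -input-. It assumes SIC is numeric. In Stata 9 and later the egen function is spelled total(), so that is what the sketch uses; sum() as written above is the older name.

```stata
clear
* Four rows copied from the example data in the question
input SIC EXP
2011 667.8039
2013 130.8002
2111 249.92
2121 3.738577
end

* 2-digit group: 20 for the first two rows, 21 for the last two
gen byte sic_1 = int(SIC/100)

* egen keeps all four observations; mysum repeats within each sic_1
egen mysum = total(EXP), by(sic_1)
list

* collapse leaves one observation per sic_1, with EXP holding the group sum
collapse (sum) EXP, by(sic_1)
list
```

After -egen- there are still four observations; after -collapse- there are two, one for sic_1 = 20 and one for sic_1 = 21, and mysum is gone because -collapse- keeps only the by-variable and the collapsed variable.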

I hope this helps.

-- David.

David Kantor

Institute for Policy Studies

Johns Hopkins University

dkantor@jhu.edu

410-516-5404

