The Stata listserver

Re: st: How can I find the sum of specific values of a variable?


From   David Kantor <dkantor@jhu.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: How can I find the sum of specific values of a variable?
Date   Thu, 20 Nov 2003 17:54:36 -0500

At 09:39 PM 11/20/2003 +0000, A Delis wrote:
Dear Statalisters

I have a dataset like the following, where the SIC values are in an ascending order
SIC EXP
2011 667.8039
2013 130.8002
.
.
.
.

2095 38.90306
2098 0.617763
2111 249.92
2121 3.738577
2131 35.32281
.
.
.
2298 5.709858
2299 39.7519
.
.
etc

What I want to find is the sum of the EXP values whose SIC value begins with 20, i.e. 667.8039+130.8002+..+38.90306+0.617763
In other words, I have some 4-digit SIC data and I want to aggregate them to 2-digit SIC.

You did not indicate the type of SIC. I will assume it is numeric.

gen byte sic_1 = int(SIC/100)

(If SIC is string, you will need to do something analogous, using substr().)
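
For the string case, here is a minimal sketch. It assumes SIC is stored as a four-character string of digits with no leading blanks; the original post does not say, so the toy data below (values taken from the question) are only an illustration.

```stata
* Toy data; SIC stored as a string here (an assumption, not stated in the post)
clear
input str4 SIC double EXP
"2011" 667.8039
"2013" 130.8002
"2111" 249.92
end

* substr(SIC, 1, 2) takes the first two characters;
* real() converts them to a number (missing if not numeric)
gen byte sic_1 = real(substr(SIC, 1, 2))
list
```

If the strings might carry leading blanks, trimming them first would be needed before taking the substring.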


Then you aggregate by sic_1, i.e.,

egen mysum = sum(EXP), by(sic_1)

or you might want to -collapse- the data instead:

collapse (sum) EXP, by(sic_1)

Note that egen..sum will retain the original form of the data, generating a constant value within each distinct value of sic_1. -collapse- will shrink the data to one observation per distinct value of sic_1.
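
To see the difference between the two approaches, here is a small sketch on four rows taken from the question. It uses total(), the name under which egen's sum() function has been documented since Stata 9; the double storage type for mysum is my own choice, to avoid float rounding in the sums.

```stata
* Four observations from the posted data
clear
input int SIC double EXP
2095 38.90306
2098 0.617763
2111 249.92
2121 3.738577
end

gen byte sic_1 = int(SIC/100)

* egen keeps all four observations; mysum is constant within each sic_1
egen double mysum = total(EXP), by(sic_1)
list

* collapse leaves one observation per sic_1; EXP now holds the group sum
* (mysum is dropped because it is not named in the collapse)
collapse (sum) EXP, by(sic_1)
list
```

After the -collapse-, the data have two observations: EXP is 39.520823 for sic_1 == 20 and 253.658577 for sic_1 == 21.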

I hope this helps.
-- David.


David Kantor
Institute for Policy Studies
Johns Hopkins University
dkantor@jhu.edu
410-516-5404




