Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: sum over variables for determinate observations


From   Roberto Ferrer <[email protected]>
To   Stata Help <[email protected]>
Subject   Re: st: sum over variables for determinate observations
Date   Sun, 26 Jan 2014 13:31:46 -0430

Alternatives are:

/*
Use -egen, total()-, to compute totals and keep an arbitrary observation
(here the first one).
*/

bysort provname atecosec: egen snumcontrib = total(numcontrib)
by provname atecosec: keep if _n == 1


/*
Use -sum- to compute a cumulative sum and keep the last observation
*/

bysort provname atecosec: gen snumcontrib = sum(numcontrib)
by provname atecosec: keep if _n == _N

A helpful reference is:

Cox, N. J. 2002. Speaking Stata: How to move step by: step.
The Stata Journal 2(1): 86–102.
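The original question mentions further numeric variables after the count variable, and the examples in this thread sum only one. A minimal sketch of one way to extend -collapse- to several variables follows. It assumes the numeric variables are stored contiguously; -othervar- and its values are hypothetical, added only for illustration. Variables not named in -collapse- are dropped, so provcode and lic are carried along with (first), which is safe here only because they are constant within province.

```stata
clear
* Reduced copy of the thread's example data; othervar is a made-up
* second numeric variable standing in for the unnamed extra variables.
input str20 provname provcode str2 lic str1 atecosec numcontrib othervar
"AGRIGENTO"   84 "AG" "A" 100 7
"AGRIGENTO"   84 "AG" "A"  50 3
"AGRIGENTO"   84 "AG" "B"  12 1
"ALESSANDRIA"  6 "AL" "A"  29 2
"ALESSANDRIA"  6 "AL" "A"  12 4
end

* Sum every variable from numcontrib through othervar within each
* province-sector cell; keep provcode and lic from the first observation.
collapse (sum) numcontrib-othervar (first) provcode lic, by(provname atecosec)

list, sepby(provname)
```

The range numcontrib-othervar depends on the variable order in the dataset, so it is worth checking with -describe- before relying on it.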

On Sun, Jan 26, 2014 at 1:13 PM, Roberto Ferrer <[email protected]> wrote:
> You're right, -collapse- works:
>
> *----------- begin code --------------
>
> clear all
> set more off
>
> input ///
> str20 provname provcode str2 lic str1 atecosec str1 atecosec2002 numcontrib
> AGRIGENTO     84   AG   A   A   100
> AGRIGENTO     84   AG   A   B    50
> AGRIGENTO     84   AG   B   C    12
> AGRIGENTO     84   AG   C   D    79
> AGRIGENTO     84   AG   O   P    34
> AGRIGENTO     84   AG   P   Q     0
> AGRIGENTO     84   AG   Z   Z     1
> ALESSANDRIA    6   AL   A   A    29
> ALESSANDRIA    6   AL   A   B    12
> ALESSANDRIA    6   AL   B   C     0
> ALESSANDRIA    6   AL   C   D     5
> end
>
> list, sepby(provname)
>
> collapse (sum) numcontrib, by(provname atecosec)
>
> list, sepby(provname)
>
> *------------------- end code ------------------------
>
> On Sun, Jan 26, 2014 at 11:06 AM, Marie-Luise Schmitz
> <[email protected]> wrote:
>> Dear Stata Users,
>>
>> I have a data set that looks like this:
>>
>> province_name    province_code_107    license_number    ateco_section    ateco_section2002    numero_contribuenti...
>> AGRIGENTO              84                       AG           A                 A                     100
>> AGRIGENTO              84                       AG           A                 B                      50
>> AGRIGENTO              84                       AG           B                 C                      12
>> AGRIGENTO              84                       AG           C                 D                      79
>> AGRIGENTO              84                       AG           O                 P                      34
>> AGRIGENTO              84                       AG           P                 Q                       0
>> AGRIGENTO              84                       AG           Z                 Z                       1
>> ALESSANDRIA            6                        AL           A                 A                      29
>> ALESSANDRIA            6                        AL           A                 B                      12
>> ALESSANDRIA            6                        AL           B                 C                       0
>> ALESSANDRIA            6                        AL           C                 D                       5
>>
>> It contains numerous numeric variables following the variable numero_contribuenti.
>> The variable ateco_section is a redefined version of the variable ateco_section2002 and shows sectors of economic activity. For instance, A = agriculture, B = fishery, etc.
>> In the redefined variable ateco_section, the old sectors A and B are merged into A.
>> The problem is that I want only one entry for sector A in each province. That is, for numeric variables such as numero_contribuenti, I want the sum of the previous A and B rows, hence:
>>
>> province_name    province_code_107    license_number    ateco_section       numero_contribuenti     .........
>> AGRIGENTO               84                       AG         A                         150
>> AGRIGENTO               84                       AG         B                          12
>>
>>
>> I want to apply that to each province.
>> I guess this problem may be solved with collapse (sum) but I am totally lost.
>> Any help is highly appreciated.
>>
>> Marie-Luise
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/


