Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Aw: Re: Re: Re: st: sum over variables for determinate observations


From   "Marie-Luise Schmitz" <[email protected]>
To   [email protected]
Subject   Aw: Re: Re: Re: st: sum over variables for determinate observations
Date   Wed, 29 Jan 2014 11:37:24 +0100 (CET)

Dear Robert,
 
actually I think I succeded in solving the problem of missings becoming zeroes through collapse but a confirmation would encourage me. 
 
For each of the variables going to be affected by collapse I created a dummy taking the value 1 when the numerical counterpart is a missing. I then applied collapse knowing that values of the dummy equal to 2 indicate missings of the summarized original variable. In a final step, I replace the collapsed numerical variables back to missings if their dummy counterpart is equal to 2.
 
---- begin code -----
order  province_name ateco_section, first
foreach var of varlist numero_contribuenti_2005-ricercatori_med_2006 {
generate d_`var'=1 if `var'==.a
}
sort province_name ateco_section
collapse(sum) numero_contribuenti_2005-d_ricercatori_med_2006, by(province_name province_code_107 license_number ateco_section)
foreach var of varlist numero_contribuenti_2005-ricercatori_med_2006{
replace `var'=.a if d_`var'==2
}
drop d_*
---- end code -----


Many thanks
Marie-Luise
 
 
 

Gesendet: Dienstag, 28. Januar 2014 um 19:12 Uhr
Von: "Roberto Ferrer" <[email protected]>
An: "Stata Help" <[email protected]>
Betreff: Re: Re: Re: st: sum over variables for determinate observations
Marie-Luise,

Maybe -mvencode- and -mvdecode- can help. Check the corresponding help files.
Below one example:

*---------------- begin code -----------------

clear all
set more off

sysuse auto
keep price mpg
replace price = .

list

mvencode price, mv(0)
collapse (sum) price mpg
mvdecode price, mv(0)

list

*----------- end code -----------------------

On Mon, Jan 27, 2014 at 8:34 AM, Marie-Luise Schmitz <[email protected]> wrote:
> Yes Nick, you are right, I badly explained myself, sorry.
>
> I used:
> -----------------------
>
> sort province_name ateco_section
> collapse(sum) numero_contribuenti_2005-ricercatori_med_2006, by(province_name province_code_107 license_number ateco_section ateco_section_description)
>
> -----------------------
> where 'numero_contribuenti_2005' is the first and 'ricercatori_med_2006' the last numerical variable in the data set.
>
> The only remaining problem is that missings I defined as .a appear as zeroes in the collapsed data although it would be desirable to keep them defined as missings.
>
>
>
>
> Gesendet: Montag, 27. Januar 2014 um 13:19 Uhr
> Von: "Nick Cox" <[email protected]>
> An: "[email protected]" <[email protected]>
> Betreff: Re: Re: st: sum over variables for determinate observations
> What you mean by "did not work" is not explained here, but once you
> -keep- just one observation for each group, scope for accurate
> calculations of totals of any other variable is lost.
>
> -collapse- is, it seems, what you need here, obviating the need for a
> loop. It was suggested earlier in this thread, and it's not clear why
> you are not using it.
>
> Nick
> [email protected]
>
> On 27 January 2014 12:12, Marie-Luise Schmitz <[email protected]> wrote:
>> Dear Roberto,
>>
>> thank you for your suggestion. I used:
>>
>> bysort province_name ateco_section: egen numero_contribuenti_2005_test = total(numero_contribuenti_2005)
>> by province_name ateco_section: keep if _n == 1
>> replace numero_contribuenti_2005_test=.a if numero_contribuenti_2005==.a
>>
>> to do the task for one variable and it perfectly worked out. But the data set contains 93 numeric variables. I tried to do a foreach loop but this did not work. Any suggestion how to do this for many variables?
>>
>>
>>
>> Gesendet: Sonntag, 26. Januar 2014 um 19:01 Uhr
>> Von: "Roberto Ferrer" <[email protected]>
>> An: "Stata Help" <[email protected]>
>> Betreff: Re: st: sum over variables for determinate observations
>> Alternatives are:
>>
>> /*
>> Use -egen, total()-, to compute totals and keep an arbitrary observation
>> (here the first one).
>> */
>>
>> bysort provname atecosec: egen snumcontrib = total(numcontrib)
>> by provname atecosec: keep if _n == 1
>>
>>
>> /*
>> Use -sum- to compute a cumulative sum and keep the last observation
>> */
>>
>> bysort provname atecosec: gen snumcontrib = sum(numcontrib)
>> by provname atecosec: keep if _n == _N
>>
>> The Stata Journal (2002)
>> 2, Number 1, pp. 86-102
>> Speaking Stata: How to move step by: step
>> Nicholas J. Cox
>>
>> is a helpful reference.
>>
>> On Sun, Jan 26, 2014 at 1:13 PM, Roberto Ferrer <[email protected]> wrote:
>>> You're right, -collapse- works:
>>>
>>> *----------- begin code --------------
>>>
>>> clear all
>>> set more off
>>>
>>> input ///
>>> str20 provname provcode str2 lic str1 atecosec str1
>>> atecosec2002 numcontrib
>>> AGRIGENTO 84 AG A
>>> A 100
>>> AGRIGENTO 84 AG A
>>> B 50
>>> AGRIGENTO 84 AG B
>>> C 12
>>> AGRIGENTO 84 AG C
>>> D 79
>>> AGRIGENTO 84 AG O
>>> P 34
>>> AGRIGENTO 84 AG P
>>> Q 0
>>> AGRIGENTO 84 AG Z
>>> Z 1
>>> ALESSANDRIA 6 AL A
>>> A 29
>>> ALESSANDRIA 6 AL A
>>> B 12
>>> ALESSANDRIA 6 AL B
>>> C 0
>>> ALESSANDRIA 6 AL C
>>> D 5
>>> end
>>>
>>> list, sepby(provname)
>>>
>>> collapse (sum) numcontrib, by(provname atecosec)
>>>
>>> list, sepby(provname)
>>>
>>> *------------------- end code ------------------------
>>>
>>> On Sun, Jan 26, 2014 at 11:06 AM, Marie-Luise Schmitz
>>> <[email protected]> wrote:
>>>> Dear Stata Users,
>>>>
>>>> I have a data set that looks like this:
>>>>
>>>> province_name province_code_107 license_number ateco_section ateco_section2002 numero_contribuenti...
>>>> AGRIGENTO 84 AG A A 100
>>>> AGRIGENTO 84 AG A B 50
>>>> AGRIGENTO 84 AG B C 12
>>>> AGRIGENTO 84 AG C D 79
>>>> AGRIGENTO 84 AG O P 34
>>>> AGRIGENTO 84 AG P Q 0
>>>> AGRIGENTO 84 AG Z Z 1
>>>> ALESSANDRIA 6 AL A A 29
>>>> ALESSANDRIA 6 AL A B 12
>>>> ALESSANDRIA 6 AL B C 0
>>>> ALESSANDRIA 6 AL C D 5
>>>>
>>>> It contains numerous numeric variables following the variable numero_contribuenti.
>>>> The variable ateco_section is a redefined version of the variable ateco_section2002 and shows sectors of economic activity. For instance, A = agriculture, B = fishery, etc.
>>>> In the redefined variable ateco_section, sectors A and B are summarzied by A.
>>>> However, the problem is that I want only one entry for sector A for each province that is, for numeric variables as numero_contribuenti I want the sum of previous A and B, hence:
>>>>
>>>> province_name province_code_107 license_number ateco_section numero_contribuenti .........
>>>> AGRIGENTO 84 AG A 150
>>>> AGRIGENTO 84 AG B 12
>>>>
>>>>
>>>> I want to apply that to each province.
>>>> I guess this problem may be solved with collapse (sum) but I am totally lost.
>>>> Any help is highly appreciated.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/[http://www.stata.com/support/faqs/resources/statalist-faq/][http://www.stata.com/support/faqs/resources/statalist-faq/[http://www.stata.com/support/faqs/resources/statalist-faq/]]
> * http://www.ats.ucla.edu/stat/stata/[http://www.ats.ucla.edu/stat/stata/][http://www.ats.ucla.edu/stat/stata/[http://www.ats.ucla.edu/stat/stata/]]
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search[http://www.stata.com/help.cgi?search]
> * http://www.stata.com/support/faqs/resources/statalist-faq/[http://www.stata.com/support/faqs/resources/statalist-faq/]
> * http://www.ats.ucla.edu/stat/stata/[http://www.ats.ucla.edu/stat/stata/]
*
* For searches and help try:
* http://www.stata.com/help.cgi?search[http://www.stata.com/help.cgi?search]
* http://www.stata.com/support/faqs/resources/statalist-faq/[http://www.stata.com/support/faqs/resources/statalist-faq/]
* http://www.ats.ucla.edu/stat/stata/[http://www.ats.ucla.edu/stat/stata/]

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index