Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: Re: basic qiestion


From   Eric Booth <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: Re: basic qiestion
Date   Tue, 23 Aug 2011 02:02:53 +0000

I got this reply from Nadine off-list--

On Aug 22, 2011, at 8:40 PM, Nadine Brooks wrote:

> Thanks Phil and Eric, but even with egen I cannot solve my problem.
> 
> I am working with survey data on 410,241 individuals of all ages.
> Some of them work and others do not. The variables that I want to sum
> are:
> 
> v9532: income from main job
> v9982: income from secondary job
> v1022: income from the third or more jobs
> 
> Only 170,014 individuals work, so when I use  egen
> sal=rowtotal(v9535 v9982 v1022) I will have people with income equal
> to zero...
> 
> Take a look:
> 
> sum v9532 v9982 v1022
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        v9532 |    170014    831.5625    1451.442          3     120000
>        v9982 |      8326    686.3957    1179.807          1      48000
>        v1022 |       672      957.75    1422.576          8      11000
> 
> egen sal=rowtotal(v9535 v9982 v1022)
> 
> . sum v9532 v9982 v1022 sal
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        v9532 |    170014    831.5625    1451.442          3     120000
>        v9982 |      8326    686.3957    1179.807          1      48000
>        v1022 |       672      957.75    1422.576          8      11000
>          sal |    410241    15.88779    225.3688          0      48000
> 
> Now I have all the individuals in my survey data with some income,
> even zero. But I don't want that.
> 

The zeros are from observations where all three v* variables are missing.  
The help file entry for -egen, rowtotal()- says:

    ...  It creates the (row) sum of the variables in varlist, treating missing as 0.
    If missing is specified and all values in varlist are missing for an observation,
    newvar is set to missing.


So, you can change your code from

egen sal=rowtotal(v9535 v9982 v1022)

to

egen sal=rowtotal(v9535 v9982 v1022), missing
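
One way to check the effect of the missing option is sketched below. It assumes the three income variables named in the quoted commands exist in the data in memory, and that any earlier sal can be dropped; it is an illustration, not output from the actual survey data.

```stata
* Sketch only: variable names are taken from the quoted commands
capture drop sal                       // discard any earlier attempt
egen sal = rowtotal(v9535 v9982 v1022), missing

* Observations where all three inputs are missing should now be missing in sal
count if missing(sal)

* summarize skips missing values, so Obs counts only rows with some income reported
summarize sal
```

If the option behaves as the help file describes, the Obs count for sal should no longer equal the full sample size.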

...
> After your advice I also tried:  egen sal=rowtotal(v9535 v9982
> v1022) if v9535>0
> because whoever has a 2nd and/or 3rd job must have the first (main) one. But
> it did not work either.

Note that the reason that:

egen sal=rowtotal(v9535 v9982 v1022) if v9535>0

won't work as you intend is that Stata treats missing values (.) as larger than any number (see help missing), so "if v9535>0" evaluates to true when v9535 is missing, even though you expect it to evaluate to false.
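
If you do want to keep an if qualifier, a minimal sketch of a condition that excludes missing values explicitly follows. It reuses the variable names from the quoted command and assumes any earlier sal can be dropped.

```stata
* Sketch only: restrict to observations with a nonmissing, positive main-job income
capture drop sal
egen sal = rowtotal(v9535 v9982 v1022) if v9535 > 0 & !missing(v9535)

* An equivalent condition, since every missing value sorts above every number:
*   if v9535 > 0 & v9535 < .
```

Observations that fail the if condition are left as missing in sal rather than set to zero.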


Returning to my example, you would run:
****
clear
input v9535  v9102  v1022 
3 4 5
5 . 6
9 . .
. . . 
1 1 1
end

egen sal3 = rowtotal(v9535 v9102 v1022), missing
list
*****
- Eric

