
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: basic question


From   Phil Clayton <[email protected]>
To   [email protected]
Subject   Re: st: basic question
Date   Tue, 23 Aug 2011 12:11:00 +1000

Nadine replied privately off-list; this is generally discouraged in the Statalist FAQ so I am re-posting her message below.

I'm a little unclear on what you actually want. If you want people with no income in any of the 3 variables to have a missing value for sal, the easiest option would be to add the -missing- option to the -egen- command:
egen sal=rowtotal(v9535 v9982 v1022), missing

egen sal=rowtotal(v9535 v9982 v1022) if v9535>0
will not do what you want because Stata treats numeric missing values as larger than any nonmissing number. Missing is therefore greater than zero, and the condition v9535>0 is also true when v9535 is missing. As an alternative you could try:
egen sal=rowtotal(v9535 v9982 v1022) if v9535>0 & v9535<.
or
egen sal=rowtotal(v9535 v9982 v1022) if v9535>0 & !missing(v9535)

Neither of these solutions is as good as -egen sal=rowtotal(v9535 v9982 v1022), missing- because they assume that v9982 and v1022 will definitely be missing if v9535 is missing. In a perfect world your dataset would be clean enough that this would always be true, but in real life this is not always the case, so it is safer to assume that there may be income recorded in v9982 and/or v1022 even if v9535 is missing.
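
To make the difference between these options concrete, here is a minimal sketch on a toy dataset. The variable names follow the thread, but the four rows of values are invented for illustration and are not taken from the survey.

```stata
* Toy data: variable names follow the thread; the values are invented
clear
input v9535 v9982 v1022
1000 200 .
 500   . .
   .   . .
   . 300 .
end

* Default behaviour: a row with all three variables missing gets a total of 0
egen sal_default = rowtotal(v9535 v9982 v1022)

* With the missing option, a row with all three variables missing stays missing
egen sal_missing = rowtotal(v9535 v9982 v1022), missing

* Conditioning on the main-job variable discards the secondary income in row 4
egen sal_if = rowtotal(v9535 v9982 v1022) if v9535 > 0 & !missing(v9535)

* Numeric missing is larger than any nonmissing number, so all 4 rows are counted
count if v9535 > 0

list, clean
```

In this toy example the fourth row has secondary income but no main-job income: sal_missing keeps its 300, while sal_if sets it to missing. The third row, which has no income at all, is 0 in sal_default and missing in the other two.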

Incidentally, why not rename the variables something more readable such as salary, income1, income2 and income3?
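
A short sketch of the renaming, again on invented toy data. The mapping of each survey variable to income1, income2 and income3 is my assumption based on the descriptions quoted below; check it against your codebook before using it.

```stata
* Toy data: one invented row, just so the commands below can run
clear
input v9535 v9982 v1022
1000 200 .
end

* Assumed mapping: main job, secondary job, third or more jobs
rename v9535 income1
rename v9982 income2
rename v1022 income3

* Keep the original meaning visible in the variable labels
label variable income1 "Income from main job"
label variable income2 "Income from secondary job"
label variable income3 "Income from third or more jobs"

* Total salary, missing when all three components are missing
egen salary = rowtotal(income1 income2 income3), missing
```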

Phil

On 23/08/2011, at 11:40 AM, Nadine Brooks wrote:

> Thanks Phil and Eric but even with egen I can not solve my problem.
> 
> I am working with survey data on 410,241 individuals of all ages.
> Some of them work and others do not. The variables that I want to sum
> are:
> 
> v9532: income from main job
> v9982: income from secondary job
> v1022: income from the third or more jobs
> 
> Only 170,014 individuals work, so when I use egen
> sal=rowtotal(v9535 v9982 v1022) I get people with income equal to
> zero...
> 
> Take a look:
> 
> sum v9532 v9982 v1022
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        v9532 |    170014    831.5625    1451.442          3     120000
>        v9982 |      8326    686.3957    1179.807          1      48000
>        v1022 |       672      957.75    1422.576          8      11000
> 
> egen sal=rowtotal(v9535 v9982 v1022)
> 
> . sum v9532 v9982 v1022 sal
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        v9532 |    170014    831.5625    1451.442          3     120000
>        v9982 |      8326    686.3957    1179.807          1      48000
>        v1022 |       672      957.75    1422.576          8      11000
>          sal |    410241    15.88779    225.3688          0      48000
> 
> Now all the individuals in my survey data have some income,
> even if it is zero. But I don't want that.
> 
> After your advice I also tried: egen sal=rowtotal(v9535 v9982
> v1022) if v9535>0
> because anyone who has a 2nd and/or 3rd job must have the first (main) one. But
> that did not work either.
> 
> Thanks, Nadine


On 23/08/2011, at 11:46 AM, Eric Booth wrote:

> <>
> Nadine:
> 
> You received several, similar answers about your issue.  In addition to all these and the help files for -sum()- and -egen-, take a look at Nick Cox's 2002  article "Speaking Stata: On getting functions to do the work." Stata Journal 2: 411–427. (Free due to SJ's moving pay wall at:  http://www.stata-journal.com/sjpdf.html?articlenum=pr0007 ) 
> 
> - Eric
> On Aug 22, 2011, at 8:24 PM, Nadine Brooks wrote:
> 
>> But it is not what is happening. Take a look:
>> 
>> sum v9532 v9982 v1022
>> 
>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>> -------------+--------------------------------------------------------
>>        v9532 |    170014    831.5625    1451.442          3     120000
>>        v9982 |      8326    686.3957    1179.807          1      48000
>>        v1022 |       672      957.75    1422.576          8      11000
>> 
>> . gen sal= (v9532+v9982+v1022)
>> (409603 missing values generated)
>> 
>> . sum v9532 v9982 v1022 sal
>> 
>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>> -------------+--------------------------------------------------------
>>        v9532 |    170014    831.5625    1451.442          3     120000
>>        v9982 |      8326    686.3957    1179.807          1      48000
>>        v1022 |       672      957.75    1422.576          8      11000
>>          sal |       638    3999.621    4536.377         68      40000
>> 
>> 
>> My variables mean:
>> 
>> v9532: income from main job
>> v9982: income from secondary job
>> v1022: income from the third or more jobs
>> 
>> But most people have only one job, so they have missing values for v9982
>> and v1022...
>> 
>> Thanks, Nadine
>> 
>> 
>> 2011/8/22 Daniel Marcelino <[email protected]>:
>>> Well, I don't know exactly what your variables mean.
>>> If you have numeric values, your result should be:
>>> v1 v2 v3  sal
>>> 1   2   3    6
>>> 1   .    1    2
>>> 
>>> 
>>> Daniel
>>> 
>>> On Mon, Aug 22, 2011 at 9:11 PM, Nadine Brooks <[email protected]> wrote:
>>>> But there are missing values, particularly in v9102 and v1022, so I
>>>> think that I cannot use the operator +, can I?
>>>> 
>>>> 
>>>> 
>>>> 2011/8/22 Daniel Marcelino <[email protected]>:
>>>>> try
>>>>> 
>>>>> gen sal= (v9535 + v9102 + v1022)
>>>>> 
>>>>> Daniel
>>>>> 
>>>>> 
>>>>> On Mon, Aug 22, 2011 at 8:54 PM, Nadine Brooks <[email protected]> wrote:
>>>>>> Hi statalist
>>>>>> 
>>>>>> I am a beginner Stata user and I am having trouble generating a new
>>>>>> variable. I am using:
>>>>>> gen sal=sum (v9535,v9102,v1022)
>>>>>> and I am getting: v9535,v9102,v1022 invalid name
>>>>>> r(198);
>>>>>> 
>>>>>> But the names of all the variables are correct, so what am I doing wrong?
>>>>>> 
>>>>>> Thanks
>>>>>> *
>>>>>> *   For searches and help try:
>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>> *   http://www.stata.com/support/statalist/faq
>>>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> About me: http://danielmarcelino.zip.net/
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> About me: http://danielmarcelino.zip.net/
>>> 
>>> 
>> 
> 
> 



