Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

From: Phil Clayton <philclayton@internode.on.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: basic question
Date: Tue, 23 Aug 2011 12:11:00 +1000

Nadine replied privately off-list; this is generally discouraged in the Statalist FAQ, so I am re-posting her message below.

I'm a little unclear on what you actually want. If you want people with no income in any of the 3 variables to have a missing value for sal, the easiest option would be to add the -missing- option to the -egen- command:

    egen sal=rowtotal(v9535 v9982 v1022), missing

The command

    egen sal=rowtotal(v9535 v9982 v1022) if v9535>0

will not do what you want because, in Stata, a missing value is treated as larger than any number. So missing is greater than zero, and observations with a missing v9535 still satisfy the condition. As an alternative you could try:

    egen sal=rowtotal(v9535 v9982 v1022) if v9535>0 & v9535<.

or

    egen sal=rowtotal(v9535 v9982 v1022) if v9535>0 & !missing(v9535)

Neither of these solutions is as good as -egen sal=rowtotal(v9535 v9982 v1022), missing- because they assume that v9982 and v1022 will definitely be missing if v9535 is missing. In a perfect world your dataset would be clean enough that this would always be true, but in real life this is not always the case, so it's safer to assume that there may be income recorded in v9982 and/or v1022 even if v9535 is missing.

Incidentally, why not rename the variables something more readable such as salary, income1, income2 and income3?

Phil

On 23/08/2011, at 11:40 AM, Nadine Brooks wrote:

> Thanks Phil and Eric, but even with egen I cannot solve my problem.
>
> I am working with survey data on 410,241 individuals of all ages.
> Some of them work and others do not. Some of the variables that I want
> to sum are:
>
> v9532: income from main job
> v9982: income from secondary job
> v1022: income from the third or more jobs
>
> Only 170,014 individuals work, so when I use egen
> sal=rowtotal(v9535 v9982 v1022) I will have people with income equal
> to zero...
>
> Take a look:
>
> sum v9532 v9982 v1022
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        v9532 |    170014    831.5625    1451.442          3     120000
>        v9982 |      8326    686.3957    1179.807          1      48000
>        v1022 |       672      957.75    1422.576          8      11000
>
> egen sal=rowtotal(v9535 v9982 v1022)
>
> . sum v9532 v9982 v1022 sal
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>        v9532 |    170014    831.5625    1451.442          3     120000
>        v9982 |      8326    686.3957    1179.807          1      48000
>        v1022 |       672      957.75    1422.576          8      11000
>          sal |    410241    15.88779    225.3688          0      48000
>
> Now I have all the individuals in my survey data with some income,
> even zero. But I don't want that.
>
> After your advice I also tried: egen sal=rowtotal(v9535 v9982
> v1022) if v9535>0
> because anyone who has a 2nd and/or 3rd job must have the first (main)
> one. But it did not work either.
>
> Thanks, Nadine

On 23/08/2011, at 11:46 AM, Eric Booth wrote:

> <>
> Nadine:
>
> You received several similar answers about your issue. In addition to
> all these and the help files for -sum()- and -egen-, take a look at
> Nick Cox's 2002 article "Speaking Stata: On getting functions to do
> the work." Stata Journal 2: 411–427. (Free due to SJ's moving pay wall
> at: http://www.stata-journal.com/sjpdf.html?articlenum=pr0007 )
>
> - Eric
>
> On Aug 22, 2011, at 8:24 PM, Nadine Brooks wrote:
>
>> But that is not what is happening. Take a look:
>>
>> sum v9532 v9982 v1022
>>
>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>> -------------+--------------------------------------------------------
>>        v9532 |    170014    831.5625    1451.442          3     120000
>>        v9982 |      8326    686.3957    1179.807          1      48000
>>        v1022 |       672      957.75    1422.576          8      11000
>>
>> . gen sal= (v9532+v9982+v1022)
>> (409603 missing values generated)
>>
>> . sum v9532 v9982 v1022 sal
>>
>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>> -------------+--------------------------------------------------------
>>        v9532 |    170014    831.5625    1451.442          3     120000
>>        v9982 |      8326    686.3957    1179.807          1      48000
>>        v1022 |       672      957.75    1422.576          8      11000
>>          sal |       638    3999.621    4536.377         68      40000
>>
>> My variables mean:
>>
>> v9532: income from main job
>> v9982: income from secondary job
>> v1022: income from the third or more jobs
>>
>> But most of the people have only one job, so they get missing for v9982
>> and v1022...
>>
>> Thanks, Nadine
>>
>> 2011/8/22 Daniel Marcelino <dmsilv@gmail.com>:
>>> Well, I don't know exactly what your variables mean.
>>> If you have numeric values, your result should be:
>>> v1 v2 v3 sal
>>> 1  2  3  6
>>> 1  .  1  2
>>>
>>> Daniel
>>>
>>> On Mon, Aug 22, 2011 at 9:11 PM, Nadine Brooks <nb.statalist@gmail.com> wrote:
>>>> But there are missing values, particularly in v9102 and v1022, so I
>>>> think that I cannot use the operator +, can I?
>>>>
>>>> 2011/8/22 Daniel Marcelino <dmsilv@gmail.com>:
>>>>> try
>>>>>
>>>>> gen sal= (v9535 + v9102 + v1022)
>>>>>
>>>>> Daniel
>>>>>
>>>>> On Mon, Aug 22, 2011 at 8:54 PM, Nadine Brooks <nb.statalist@gmail.com> wrote:
>>>>>> Hi statalist
>>>>>>
>>>>>> I am a beginner Stata user and I am having trouble generating a new
>>>>>> variable. I am using:
>>>>>> gen sal=sum (v9535,v9102,v1022)
>>>>>> and I am getting: v9535,v9102,v1022 invalid name
>>>>>> r(198);
>>>>>>
>>>>>> But the names of all the variables are correct, so what am I doing wrong?
>>>>>>
>>>>>> Thanks
>>>>>> *
>>>>>> *   For searches and help try:
>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>> *   http://www.stata.com/support/statalist/faq
>>>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>>
>>>>> --
>>>>> About me: http://danielmarcelino.zip.net/

**Follow-Ups**:
- **Re: st: basic question** *From:* Nadine Brooks <nb.statalist@gmail.com>

**References**:
- **st: basic question** *From:* Nadine Brooks <nb.statalist@gmail.com>
- **Re: st: basic question** *From:* Daniel Marcelino <dmsilv@gmail.com>
- **Re: st: basic question** *From:* Nadine Brooks <nb.statalist@gmail.com>
- **Re: st: basic question** *From:* Daniel Marcelino <dmsilv@gmail.com>
- **Re: st: basic question** *From:* Nadine Brooks <nb.statalist@gmail.com>
- **Re: st: basic question** *From:* Eric Booth <ebooth@ppri.tamu.edu>
