Statalist



st: RE: Summing variables by row


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Summing variables by row
Date   Mon, 15 Sep 2008 17:46:12 +0100

Leonor, I presume, wants -egen, rowtotal()- to return a result of missing if (and I presume only if) all values in an observation are missing. (Look at the example.) 

Rachael Williams I think wants the same. 

Martin Weiss I think understood this as wanting a result of missing if any of the values being summed was missing. But "any" and "all" are different. 

Ashim Kapoor also contributed to the discussion, but I can't follow his comments. 

It is enough to think of what happens when two variables are being summed across an observation. No new principles arise with three or more variables. 

Leonor and Rachael want missing + missing to be returned as missing. 

A glance at the help shows that you can get this with -egen, rowtotal()- so long as you specify the -missing- option. 
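As a minimal sketch of that option, assuming Stata 10.1 or later: the toy data below are made up, patterned on the example in the original question, and the names var1-var3 and total are only illustrative.

```stata
* Toy data: the third observation is missing on every variable
clear
input var1 var2 var3
. 0 .
1 . .
. . .
0 1 1
end

* With -missing-, an observation gets a missing total only if
* all of its values are missing; otherwise missings are ignored
egen total = rowtotal(var1 var2 var3), missing
list
```

Without the -missing- option the third observation would get a total of 0 rather than missing.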

This was added in Stata 10.1. If you can't see this in the help, you either haven't updated, or are on some previous version of Stata. 

Martin, or anyone who wants -egen, rowtotal()- to return missing if _any_ value is missing, will not get that from -egen- in one swoop unless they program a new -egen- function. The most direct way to get it in general that I can imagine is 

gen rowtotal = 0 
foreach v of varlist <whatever> { 
	replace rowtotal = rowtotal + `v' 
} 

One missing in the varlist will be enough to ensure a missing result. 
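To see the "any" behaviour concretely, here is the same loop run on a small made-up dataset; the variable names a, b, c and total_any are hypothetical, chosen just for this illustration.

```stata
* Toy data: obs 1 is complete, obs 2 has one missing, obs 3 is all missing
clear
input a b c
1 2 3
1 . 3
. . .
end

* Start from zero and add each variable in turn; any missing
* value propagates, because a number plus missing is missing
gen total_any = 0
foreach v of varlist a b c {
    replace total_any = total_any + `v'
}
list
```

Only the first observation keeps a nonmissing total here.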

Another way to get it is something like 

egen rowtotal = rowtotal(<varlist>) 
egen nmiss = rowmiss(<varlist>) 
replace rowtotal = . if nmiss 

which, despite being one line shorter, is much less efficient than the previous code. 

Of course, if you have only a few variables, just writing (e.g.) 

gen rowtotal = a + b 

is best of all. 

Nick 
n.j.cox@durham.ac.uk 

Leonor Saravia

I'm trying to sum many variables by row using the -rowtotal()-
function. The problem is that I need to distinguish between a resulting
sum of zero and a missing value. Do you know of another command
or function that sums across a varlist and keeps the missing values
when they are present?

I'd like to generate a variable 'total' like this:

var1  var2  var3 ...... varN  total
.     0     .           .     0
1     .     .           .     1
.     .     .           .     .
0     1     .           1     2
.
.
.
1     0     0           1     2




