st: Re: handling of MV in summary stats


From   Kit Baum <baum@bc.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: Re: handling of MV in summary stats
Date   Wed, 17 Dec 2003 09:18:46 -0500

On Dec 17, 2003, at 2:33 AM, David wrote:

Note that commands that take several variables together generally
work on the set of observations having no missing values on any of the
variables under consideration.

An important exception to this general rule involves the 'r...' egen functions, which perform spreadsheet-like operations across variables and thus provide an alternative to doing something like
gen avg = (x1+x2+x3)/3
which will indeed be missing if any component is missing.
egen avg = rmean(x1 x2 x3)
will not behave the same way; it computes the mean from whatever values are available, and the associated rmiss() counts the missing values in each row, from which you can work out the effective divisor. The r... functions are very useful, but one must read the help carefully to know how each one deals with MVs. (I once had a problem with egen rsd(), the row standard deviation, which ignores MVs: it returned zero if _all_ arguments were missing. I agreed with Stata's developers that there was no variance across those variables, but argued successfully that the rule that a function of NaNs should return a NaN should trump the statistical logic!)
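
To make the contrast concrete, here is a minimal sketch on made-up data. It assumes a Stata release in which the r... functions are documented under their later names, rowmean() and rowmiss(), in place of rmean() and rmiss(); the variable names avg_gen, avg_row, nmiss and divisor are invented for this illustration.

```stata
* Toy data (invented): each row has a different pattern of missing values
clear
input x1 x2 x3
1 2 3
1 . 3
. . 5
. . .
end

* Arithmetic on the variables: missing whenever any component is missing
generate avg_gen = (x1 + x2 + x3)/3

* Row mean: uses whatever values are available in each row
egen avg_row = rowmean(x1 x2 x3)

* Count of missing values per row; the effective divisor is 3 minus that count
egen nmiss = rowmiss(x1 x2 x3)
generate divisor = 3 - nmiss

list, clean
```

In the listing, avg_gen should be nonmissing only for the first row, while avg_row should be missing only for the last row, where there is nothing to average.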

Kit
