
Re: Missing values [was: RE: st: simple question]


From   Joseph Coveney <jcoveney@bigplanet.com>
To   Statalist <statalist@hsphsun2.harvard.edu>
Subject   Re: Missing values [was: RE: st: simple question]
Date   Mon, 07 Jun 2004 12:45:31 +0900

Nick Cox wrote:

>For consistency with what?

For consistency (in the context of data management) between the way Stata
treats missing values when it handles a variable as continuous and the way it
treats them when it handles a variable as categorical.  When handling data as
continuous, Stata treats missing as the highest possible value, but when
handling data as categorical, Stata treats missing as missing.  A quick
example:  -tabulate, generate()- and -tabulate, missing generate()- give
similar results, in that the dummy variables are set to missing (.) for
records in the missing-value categories (. through .z).
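
A minimal sketch for checking this contrast, assuming a hypothetical toy
variable x and made-up prefixes d and m for the dummies (none of these names
come from the thread).  It does not assert what the dummies will contain;
the final -list- is there so the two -tabulate- calls can be compared
directly in whatever release is at hand.

```stata
* Hypothetical toy data: two nonmissing values, system missing (.),
* and one extended missing value (.a)
clear
input x
1
2
.
.a
end

* Continuous handling: missing counts as larger than any number,
* so big is 1 for the rows where x is . or .a
generate byte big = x > 100
list x big

* Categorical handling: build dummies with and without -missing-
tabulate x, generate(d)
tabulate x, missing generate(m)

* Compare the two sets of dummies row by row
list x d* m*
```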

Now that Nick mentions it, -tabulate-'s case is not so clear-cut, because it
serves both data management and statistical purposes, which probably explains
its choice of default behavior.  The help file for -tabulate- is also explicit
about what its -missing- option does.  But perhaps it would have been better
if -tabulate, missing generate()- behaved more like -egen = group(),
missing-, since the context is clearly data management.
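
For reference, a short sketch of the -egen- behavior being compared against,
again on a hypothetical toy variable x with made-up names g_default and
g_missing.  With the -missing- option, group() treats missing values like any
other value when assigning groups; without it, observations with missing x
get a missing group.

```stata
* Hypothetical toy data, including system and extended missing values
clear
input x
1
2
.
.a
end

* Default: rows with missing x receive a missing group number
egen g_default = group(x)

* With -missing-: missing values are assigned group numbers of their own
egen g_missing = group(x), missing

list x g_default g_missing
```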

Nick's other points are well taken, too, and I wasn't trying to hold SAS up
as an exemplar--Stata's choices for default behavior are agreeable.

Joseph Coveney






