Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: RE: AW: RE: RE: Delete missing

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: AW: RE: RE: Delete missing
Date	Sun, 9 May 2010 19:01:34 +0100

I think that's highly contentious. 

For one, it is easy to imagine instances in which that strategy
automatically leads to many more indicators (dummies, if you will) than
directly available predictors and to a kind of model that would not
strike anybody in its target audience as scientifically interesting or
useful. 

Besides, modelling with predictor variables on the RHS is far from the
only kind of statistical analysis possible. 

Nick 
[email protected] 

Martin Weiss

Overall, -drop-ping is an inferior strategy to using a dummy for inclusion
in the analysis:
http://www.stata.com/statalist/archive/2009-12/msg00511.html
The only reason not to go for the latter strategy is the fear that the -if-
qualifier will be forgotten at some stage - which cannot happen after the
-drop- command...
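
A minimal sketch of the inclusion-dummy approach, for illustration only. The auto dataset, the regression, and the variable name touse are assumptions made for this example, not taken from the linked post:

```stata
* Sketch only: auto.dta, the model, and the name "touse" are illustrative assumptions
sysuse auto, clear

* Mark observations with complete data on the variables of interest
mark touse
markout touse price mpg rep78

* Restrict the analysis with -if- rather than dropping anything
regress price mpg rep78 if touse

* All observations are still in memory for other analyses
count
```

Nothing is removed from memory here, so a later analysis on other variables can still use the observations that were excluded from this one.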

Nick Cox

Tony's comment seems a bit more severe than the facts warrant. 

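If you have missings, most Stata commands will just ignore them (estimation
commands, for example, omit observations with missing values on the
variables used), so -drop-ping them from the dataset is not going to make
much difference to that. 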

My impression as its author is that many of the uses of -dropmiss- (SJ)
in particular and many of the reasons for this request arise from
innocuous missings. For example, spreadsheet people often leave blank
rows and/or columns in their worksheets just as ways of making their
data more readable. Import into Stata will usually take such rows and
columns literally but they have no content and are best -drop-ped
straight away. There is no statistical issue in those situations raised
by -drop-ping missings, as the missings do not correspond to potential
data even in principle.   
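
The blank-row and blank-column case can be sketched with official commands only, in case the user-written -dropmiss- is not installed. The toy data and the names x, y, z, and nmiss are hypothetical, and the sketch assumes numeric variables throughout:

```stata
* Sketch only: toy data standing in for an imported worksheet
clear
input x y z
1 2 .
. . .
3 . .
end

* Drop variables that are missing in every observation (blank columns)
foreach v of varlist _all {
    quietly count if !missing(`v')
    if r(N) == 0 {
        drop `v'
    }
}

* Drop observations that are missing on every remaining variable (blank rows)
unab allvars : _all
local nvars : word count `allvars'
egen nmiss = rowmiss(`allvars')
drop if nmiss == `nvars'
drop nmiss
```

Only z and the second observation are removed. The third observation stays, because it still carries a value for x.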

Where it gets more complicated is that some people are tempted to -drop-
variables and/or observations in which _any_ values are missing. That's
usually going to lead to loss of information. That may be Tony's main
point. 
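
A small sketch of how much can be lost that way, using made-up data (the names id, a, b, c, and nmiss are hypothetical). Nine of the twelve values of a, b, and c are observed, yet only one observation survives:

```stata
* Sketch only: toy data in which each variable is missing in a different observation
clear
input id a b c
1  1  2  3
2  .  5  6
3  7  .  9
4 10 11  .
end

* Count missing values per observation
egen nmiss = rowmiss(a b c)
count if nmiss > 0

* Drop every observation with any missing value, as in the original request
drop if nmiss > 0
count
```

Running the first -count- before the -drop- shows how many observations are about to go, which is worth checking before doing anything irreversible.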

Nick 
[email protected] 

Lachenbruch, Peter

Generally, this is a very bad idea.  You will get biased estimates of
any parameters you estimate unless the data are missing completely at
random.  Check
the multiple imputation manual.  Also note that Stata is not capitalized
as you have done.

Patricia Yu [[email protected]]

I have a question about deleting missing data.
I would like to delete cases if they have missing values in any variables
in my dataset.
How can I delete these cases with any missing data in STATA?
Could you please share the STATA code with me?

