
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: Missing data on outcome and sample selection bias

From	Rosie Chen <[email protected]>
To	[email protected]
Subject	st: Missing data on outcome and sample selection bias
Date	Mon, 1 Mar 2010 07:03:23 -0800 (PST)

Carlo, thanks for your response. My question is not about right censoring or missing values on the independent variables. It concerns respondents who did not answer the question that defines the outcome variable. We can't impute outcome values, which is why we often have to delete cases with missing values on the dependent variable. But deleting them creates a potential sample selection bias.

So, dear all, here are my questions regarding a multilevel analysis with missing values on the outcome variable:

(1) Do we usually compare the deleted cases with the final raw sample
(without missing-data imputation) or with the final sample after the
missing cases have been imputed?
(2) To what extent can t-tests be useful for detecting sample selection
bias? What criterion do we use? Must the t-tests on all predictors be
significant to indicate such a problem, or is it enough that half of the
tests are significant?
(3) If the t-test is not a very good tool for assessing the problem,
should we use the Heckman method? Can we use the Heckman approach in Stata
to detect and remedy possible sample selection bias on a dependent
variable? I learned that Stata has -heckman- and -gllamm- commands, but I
am not sure whether they can take all three features (multilevel data
structure, multiply imputed data, and complex survey design) into account.
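
For reference, a minimal sketch of the Heckman selection model raised in question (3). It uses the womenwk example dataset distributed by StataCorp, which is an assumption made here for illustration and not the poster's data. In that dataset wage is missing for women who do not work, and -heckman- treats observations with a missing outcome as not selected. The sketch ignores the multilevel structure, multiple imputation, and survey design, so it does not settle whether all three can be combined.

```stata
* Example data from StataCorp, standing in for the real data
webuse womenwk, clear

* wage is missing for non-working women; count the cases
* that a complete-case analysis would drop
count if missing(wage)

* Outcome equation:   wage on educ and age
* Selection equation: whether wage is observed, modeled on
*                     married, children, educ, and age
heckman wage educ age, select(married children educ age)
```

With the default maximum likelihood estimator, the output footer reports a likelihood-ratio test of independent equations (rho = 0). Rejecting it suggests that, under this model, selection into the observed sample is related to the outcome.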

Your advice would be appreciated very much,

Rosie



----- Original Message ----
From: Carlo Lazzaro <[email protected]>
To: [email protected]
Cc: Rosie Chen <[email protected]>
Sent: Mon, March 1, 2010 1:58:39 AM
Subject: R: Missing data analysis



Dear Rosie,
I am not clear about what you mean by "we have to delete cases that
have missing values", since this is not the standard practice.

If you mean (right-)censored observations, they can be handled in Stata
with the survival analysis suite (please see -stset- and the related st
commands in Stata 9.2/SE).
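
A minimal sketch of what declaring right-censored data with -stset- looks like, assuming the drugtr example dataset distributed by StataCorp (not the poster's data). There, died equals 1 when the event was observed and 0 when the observation is right-censored.

```stata
* Example data from StataCorp, standing in for the real data
webuse drugtr, clear

* Declare survival-time data: studytime is the analysis time,
* died == 1 marks an observed failure, died == 0 a censored case
stset studytime, failure(died)

* Summarize the declared survival data
stdescribe

* Censored cases stay in the estimation sample instead of being deleted
stcox drug age
```

This applies only when the outcome is a time to event; it does not address item nonresponse on the outcome.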

For more details on dealing with missing observations, especially when
they occur in the predictors rather than in the outcome, you might want
to take a look at:

Little RJA, Rubin DB. Statistical analysis with missing data. Second
Edition. Hoboken, NJ: Wiley, 2002.

HTH and Kind Regards,

Carlo 

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Rosie Chen
Sent: Sunday, 28 February 2010 21:31
To: [email protected]
Subject: st: Missing data analysis

Hi, dear listserv members,

   I have a question that is not specifically related to Stata, but I
would like to try asking it here:

   In most studies, we have to delete cases that have missing values on the
outcome variable. The issue is whether the deleted cases are significantly
different from the final sample we use, because of the potential sample
selection bias problem.  My question is: do we often compare the deleted
cases with the final raw sample without missing data imputation or with the
final sample with missing cases imputed? Any suggestions are appreciated
very much,
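
One common way to set up this comparison against the raw (unimputed) sample, sketched with the auto dataset shipped with Stata, where rep78 has a few missing values and stands in for the outcome. The dataset and variable choices are illustrative assumptions only.

```stata
* Dataset shipped with Stata, standing in for the real data
sysuse auto, clear

* Flag cases whose outcome is missing (these would be deleted)
generate byte miss_y = missing(rep78)
tabulate miss_y

* Compare each predictor between deleted and retained cases
ttest price, by(miss_y) unequal
ttest mpg, by(miss_y) unequal

* Joint alternative: model missingness on the predictors together
logit miss_y price mpg
```

Neither step accounts for clustering or survey weights, and a nonsignificant difference on observed predictors does not rule out selection on unobserved factors.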

  Rosie



      
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- st: RE: Missing data on outcome and sample selection bias
  - From: "Lachenbruch, Peter" <[email protected]>

References:
- st: R: Missing data analysis
  - From: "Carlo Lazzaro" <[email protected]>
