Statalist The Stata Listserver



Re: st: Re: simple way to create missing data that is "missing atrandom" from a small datset


From   Suzy <[email protected]>
To   [email protected]
Subject   Re: st: Re: simple way to create missing data that is "missing atrandom" from a small datset
Date   Fri, 24 Feb 2006 19:27:20 -0500

Thank you, Maarten. I also dichotomized BMI missingness (generated a new variable, bmicat = 1 if bmi is missing, 0 otherwise). I then ran logistic regressions with bmicat as the binary response, first univariately (age alone, sex alone, race alone, etc.) and then with the full model. In each case, the odds of BMI missingness were significantly associated with age but not with any other variable; age remained associated with bmicat in the full model after accounting for the other variables. I have heard that this approach can be used to assess MCAR vs. MAR. Do you agree?

tab bmicat

      bmimi |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        305       91.87       91.87
          1 |         27        8.13      100.00
------------+-----------------------------------
      Total |        332      100.00


logistic bmicat age sex fhdm dmcat race

Logistic regression                               Number of obs   =        332
                                                  LR chi2(5)      =      37.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -74.639705                       Pseudo R2       =     0.2028

------------------------------------------------------------------------------
      bmicat | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.121633   .0268219     4.80   0.000     1.070276    1.175454
         sex |   .9201524   .4542155    -0.17   0.866     .3496878    2.421247
        fhdm |   1.060376   .5558315     0.11   0.911     .3795549    2.962413
       dmcat |   .7724482   .4741646    -0.42   0.674     .2319329    2.572625
        race |   1.340202    .791231     0.50   0.620     .4213434    4.262891
------------------------------------------------------------------------------
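
For anyone wanting to reproduce the check described above, here is a minimal sketch of the steps. It assumes the original variable is named bmi and that the covariates are named as in the output; the loop is just one convenient way to run the univariate models, not necessarily how they were run here. Note that a check like this can only show whether missingness is related to observed variables (evidence against MCAR); it cannot rule out that missingness also depends on the unobserved BMI values themselves.

```stata
* Sketch only: variable names are taken from the post, bmi is assumed
* to be the variable with missing values.

* 1 if bmi is missing, 0 otherwise
generate byte bmicat = missing(bmi)
tabulate bmicat

* Univariate models: one predictor of missingness at a time
foreach v of varlist age sex fhdm dmcat race {
    logistic bmicat `v'
}

* Full model: all predictors together
logistic bmicat age sex fhdm dmcat race
```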



Maarten Buis wrote:


Suzy:
You wanted to create missingness according to a MAR process; in your case the probability of
missingness in the variable bmi should depend on the variable age. So we created the probability
of missingness for each observation. The youngest person in your dataset has a probability of
missingness of invlogit(-8 + .1*28) = .0054863 (type -di invlogit(-8 + .1*28)-) and the oldest
person has a probability of missingness of invlogit(-8 + .1*82) = .549834. If the probability of
missingness were constant (or random and unrelated to any of the other variables), then the
missingness mechanism would be missing completely at random (MCAR).
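
The mechanism described above can be sketched roughly as follows. This is an illustration under assumptions, not a copy of the code from earlier in the thread: the names p, bmi_mar and bmi_mcar and the seed are made up for the example, and the constant .08 for the MCAR contrast is simply chosen to be close to the share of missing values in the tabulation. The function runiform() is the current name of the uniform random-number function; older Stata releases used uniform().

```stata
* Sketch only: p, bmi_mar, bmi_mcar and the seed are hypothetical.
set seed 12345

* Probability of missingness rises with age (MAR given age)
generate double p = invlogit(-8 + .1*age)
summarize p

* Check the two endpoints quoted in the message
display invlogit(-8 + .1*28)
display invlogit(-8 + .1*82)

* MAR: set bmi to missing with probability p
generate double bmi_mar = bmi
replace bmi_mar = . if runiform() < p

* MCAR contrast: the same constant probability for everyone
generate double bmi_mcar = bmi
replace bmi_mcar = . if runiform() < .08
```

Working on copies (bmi_mar, bmi_mcar) rather than overwriting bmi keeps the complete data available for comparing results under the two mechanisms.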

HTH,
Maarten

--- Suzy <[email protected]> wrote:


I'm not sure what the implications are of the std dev and the max values of p (.549).

-----------------------------------------
between 1/2/2006 and 31/3/2006 I will be
visiting the UCLA, during this time the
best way to reach me is by email

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------





*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/







