Re: st: Regressing with variables with missing values


From   Ramani Gunatilaka <ramani.gunatilaka@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Regressing with variables with missing values
Date   Mon, 7 Nov 2005 09:38:11 +1100

Hi all,
I have been following up on all the useful comments I got and have
been working on that ice thing to replace missing values.
Unfortunately the programme goes through the motions but doesn't
replace any missing values. I am at my wit's end. The dependent
variable, and the only one that has missing values, is happy, which
takes the values 1-5 depending on level of happiness (the data set as
a whole has 6805 observations). My code runs like this:

use uphvar02, clear

. ice happy ln_pcy02 r_health male divorced widowed using uricevar02,
cmd(regress) eq(happy: ln_pcy02 r_health male divorced widowed)
genmiss(M1) id(flag1) replace

This is my output:

    Variable | Command     | Prediction equation
-------------+-------------+--------------------------------------------------
       happy | regress     | ln_pcy02 r_health male divorced widowed
    ln_pcy02 | regress     | [No missing data in estimation sample]
    r_health | regress     | [No missing data in estimation sample]
        male | regress     | [No missing data in estimation sample]
    divorced | regress     | [No missing data in estimation sample]
     widowed | regress     | [No missing data in estimation sample]

Imputing
[Only 1 variable to be imputed, therefore no cycling needed.]
1..file uricevar02.dta saved

. sort city province hhid

. compress

. save uricevar02, replace
file uricevar02.dta saved.
end of do-file

But when I check - here's what I get. Missing values still there.

. count if happy==.
   65

Does anybody have any ideas as to what might be going wrong?
Thanks so much,
Ramani
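
A possible explanation, offered as a sketch rather than a confirmed
diagnosis: as far as the -ice- help file of this period describes it,
the imputed data are written to the file named after -using- (here
uricevar02.dta) and the data in memory are left unchanged. If that
holds, the final -save uricevar02, replace- overwrites the imputed file
with the original data, and -count if happy==.- is run on data that were
never imputed. Under that assumption, a do-file that loads the imputed
file before counting, sorting and saving might look like this:

    use uphvar02, clear
    ice happy ln_pcy02 r_health male divorced widowed using uricevar02, ///
        cmd(regress) eq(happy: ln_pcy02 r_health male divorced widowed) ///
        genmiss(M1) id(flag1) replace

    * assumption: -ice- has written the imputations to uricevar02.dta,
    * so load that file rather than saving the data still in memory
    use uricevar02, clear
    count if happy==.

    sort city province hhid
    compress
    save uricevar02, replace

If the count is still 65 after loading uricevar02.dta, the cause lies
elsewhere and the -ice- help file for the installed version is the place
to check.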


On 03/11/05, Garrard, Wendy M. <wendy.garrard@vanderbilt.edu> wrote:
> Ramani,
> The MAR assumption is pretty robust to some violations. The main issue
> for MAR is whether you have some observed covariates that provide
> information about the missing values.  For example, if household income
> is missing, then other observed variables (e.g., zip code, occupation,
> education level) may provide some basis for plausible estimation.
>
> If you have some good covariates you may be able to construct a
> relatively simple regression model to come up with some plausible
> estimates of the missing values.  Note -- if you have good covariates
> multiple imputation is also an option.  If you don't have observed
> covariate information, and the missing data is non-random (MNAR), then
> more specialized (and probably complex) models are required for handling
> the missing data.
>
> If you can justify MAR, the -impute- command may help you, although the
> multiple imputation algorithms are more cutting edge these days.
>
> Cheers,
> wg
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ramani
> Gunatilaka
> Sent: Wednesday, November 02, 2005 2:50 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Regressing with variables with missing values
>
> Thanks, Paul. I did download listmiss and use it. Now my dilemma is that
> the main culprits appear non-random wrt the dependent variable according
> to listmiss (i.e., t and p values appear in yellow with stars). That means
> that I can't use ice because that assumes that the missing observations
> are missing at random. I'd be grateful for any suggestions as to what I
> should do next.
> Ramani
>
> On 03/11/05, Paul Millar <paul.millar@shaw.ca> wrote:
> > You might also use the post-estimation command - listmiss - to find
> > which variables are the main culprits and which ones have missing
> > values that are non-random wrt the dependent variable.
> > ssc install listmiss
> >
> > - Paul Millar
> >
> > At 09:18 AM 02/11/2005, you wrote:
> > >At 10:52 AM 11/2/2005, Ramani Gunatilaka wrote:
> > >>Dear Statalist,
> > >>This may seem a stupid question for the statisticians among you but
> > >>I'd appreciate some help.
> > >>I want to run a regression on cross-section data with lots of
> > >>variables, some of which have missing values. When I do that, Stata
> > >>estimates the model using only the observations which have values
> > >>for all variables. I downloaded tabmiss and rmiss2 as in the relevant
> > >>FAQ and the commands would certainly help in enabling me to decide
> > >>which variables to drop. But is there any way that I could retain
> > >>all the variables with their missing values and make allowance for
> > >>the missing values by including a dummy for missing variables?
> > >
> > >The way you retain the missing values is by recoding them to a
> > >non-missing value, e.g. the variable's mean.  This has all sorts of
> > >problems though.  The MD dummy variable indicator that you propose
> > >used to be popular but has since been discredited.  See Paul
> > >Allison's Sage book "Missing Data."
> > >
> > >For a synopsis of basic strategies and their pros and cons, see
> > >
> > >http://www.nd.edu/~rwilliam/stats2/l12.pdf
> > >
> > >That handout is weak in discussing more advanced methods, although it
> > >does allude to them.  You might check out Royston's -ice- package,
> > >which was recently updated and discussed in the Stata Journal.  Use
> > >
> > >-findit ice-
> > >
> > >
> > >-------------------------------------------
> > >Richard Williams, Notre Dame Dept of Sociology
> > >OFFICE: (574)631-6668, (574)631-6463
> > >FAX:    (574)288-4373
> > >HOME:   (574)289-5227
> > >EMAIL:  Richard.A.Williams.5@ND.Edu
> > >WWW (personal):    http://www.nd.edu/~rwilliam
> > >WWW (department):    http://www.nd.edu/~soc
> > >*
> > >*   For searches and help try:
> > >*   http://www.stata.com/support/faqs/res/findit.html
> > >*   http://www.stata.com/support/statalist/faq
> > >*   http://www.ats.ucla.edu/stat/stata/