

From: A Loumiotis <antonis.loumiotis@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Missing Observations. Do I need multiple Imputations?
Date: Wed, 22 Aug 2012 12:08:03 +0300

I agree with you and I think that's what I also said. Your composite variable is missing if at least one of its component variables (B C D E F) is missing. When none of the component variables is missing, your composite variable is not missing.

On Wed, Aug 22, 2012 at 10:32 AM, Abekah Nkrumah <ankrumah@gmail.com> wrote:
> Dear Antonis,
>
> Thank you very much for your reply. I want to understand your first
> line: were you saying my aggregate variable is missing entirely? In my
> statement I said the composite index (A), which you referred to as the
> aggregate variable, is there but drops a substantial number of
> observations. So it is not entirely missing.
>
> Thanks very much
>
> Regards
>
> On Wed, Aug 22, 2012 at 7:44 AM, A Loumiotis
> <antonis.loumiotis@gmail.com> wrote:
>> Hi Gordon,
>>
>> Since your aggregate variable is missing when at least one component
>> is missing, I believe you would first need to multiply impute the
>> missing observations of your dataset and then compute your aggregate
>> variable. I don't see a problem with multiply imputing variables such
>> as age or number of wives. In addition, your results might change if
>> your data are missing (conditionally) at random, even if your
>> non-missing sample is large.
>>
>> Best,
>> Antonis
>>
>> On Tue, Aug 21, 2012 at 7:18 PM, Abekah Nkrumah <ankrumah@gmail.com> wrote:
>>> Dear Statalist,
>>>
>>> I would like some advice on this rather long question. Variable A in
>>> the table below is a composite index derived from the aggregation of
>>> variables B, C, D, E and F, which are also sub-indices. A geometric
>>> aggregation method was used. From the table I realise that the number
>>> of observations on the composite index (A) drops significantly.
>>>
>>>     Variable |       Obs        Mean    Std. Dev.        Min        Max
>>> -------------+----------------------------------------------------------
>>>            A |     69623    .4898275    .1575975    .0498657   .8980919
>>>            B |    187524     .524507    .2669241    1.80e-08          1
>>>            C |    221089    .6625131    .3732415    2.18e-08          1
>>>            D |    234680    .7486263    .3494941   -1.29e-08          1
>>>            E |    108437    .5253285    .0648927   -2.61e-08          1
>>> -------------+----------------------------------------------------------
>>>            F |    119261    .6829314    .2270192   -1.62e-08          1
>>>
>>> I then decided to do a missing data check for all the indices, and the
>>> result is below.
>>>
>>>        Variable |    Missing       Total    Percent Missing
>>> ----------------+-----------------------------------------------
>>>               A |    166,075     235,698              70.46
>>>               B |     48,174     235,698              20.44
>>>               C |     14,609     235,698               6.20
>>>               D |      1,018     235,698               0.43
>>>               E |    127,261     235,698              53.99
>>>               F |    116,437     235,698              49.40
>>> ----------------+-----------------------------------------------
>>>
>>> I then checked the percentage missing for all the individual variables
>>> used in computing the sub-indices, especially B, C, E and F. The
>>> result is below.
>>>
>>>        Variable |    Missing       Total    Percent Missing
>>> ----------------+-----------------------------------------------
>>>              B1 |     46,317     235,698              19.65
>>>              B2 |     46,967     235,698              19.93
>>>              B3 |     46,815     235,698              19.86
>>>              B4 |     47,005     235,698              19.94
>>>              C1 |      5,128     235,698               2.18
>>>              C2 |      5,164     235,698               2.19
>>>              C3 |      6,180     235,698               2.62
>>>              C4 |      9,730     235,698               4.13
>>>              C5 |      5,608     235,698               2.38
>>>              D1 |        444     235,698               0.19
>>>              D2 |        483     235,698               0.20
>>>              D3 |        657     235,698               0.28
>>>              E1 |     82,112     235,698              34.84
>>>              E2 |     58,504     235,698              24.82
>>>              E3 |     65,469     235,698              27.78
>>>              E4 |     81,349     235,698              34.51
>>>              F1 |        214     235,698               0.09
>>>              F2 |     63,503     235,698              26.94
>>>              F3 |     86,512     235,698              36.70
>>>              F4 |        674     235,698               0.29
>>> ----------------+-----------------------------------------------
>>>
>>> The results above suggest that the drop in the number of observations
>>> for the composite empowerment variable is due to the high level of
>>> missing values in the four sub-indices (B, C, E and F), as also
>>> supported by the high level of missing values in the variables used in
>>> computing those indices.
>>>
>>> I was therefore wondering whether an explanation like this in the
>>> appendix of my work will be fine, or whether I will need to do
>>> multiple imputation to replace the missing data.
>>>
>>> I have thought this through, and the question I am asking myself is
>>> that, if I have to do multiple imputation, the variables for the
>>> imputation exercise will be the B variables (these are decision-making
>>> variables), then the E variables (these are number of wives, age at
>>> first marriage, woman's age, partner's age) and then F3 and F4 (which
>>> are partner's education and whether a woman earns cash).
>>>
>>> My worry is whether it will be sensible to impute variables such as
>>> age and number of wives. Secondly, considering that I still have a
>>> large sample size to work with, my guess is that the results from the
>>> remaining sample will not change that much. Thus I am wondering whether
>>> it will still be necessary to impute the missing data.
>>>
>>> I would appreciate hearing from you on this so I will know which way
>>> to go. Thank you very much.
>>>
>>> Regards
>>>
>>> Gordon
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
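The point made in the reply above, that the composite is missing whenever at least one component is missing, can be illustrated with a minimal Python sketch. It is not the poster's code: the thread only says "a geometric aggregation method was used", so the unweighted geometric mean here is an assumption, as are the helper name `geometric_composite` and the toy rows, which are made up for illustration.

```python
import math

# Sub-index names taken from the thread; the aggregation rule is an assumption.
COMPONENTS = ["B", "C", "D", "E", "F"]


def geometric_composite(row):
    """Unweighted geometric mean of the components (assumed rule).

    Returns None (missing) as soon as any single component is missing,
    which mirrors how the composite index A behaves in the thread.
    """
    values = [row[name] for name in COMPONENTS]
    if any(v is None for v in values):
        return None
    return math.prod(values) ** (1.0 / len(values))


# Hypothetical toy data, NOT the poster's dataset. None marks a missing value.
rows = [
    {"B": 0.5, "C": 0.6, "D": 0.7, "E": 0.5, "F": 0.7},
    {"B": None, "C": 0.6, "D": 0.7, "E": 0.5, "F": 0.7},
    {"B": 0.4, "C": 0.9, "D": 0.8, "E": None, "F": 0.6},
    {"B": 0.3, "C": 0.2, "D": 1.0, "E": 0.5, "F": None},
]

composite = [geometric_composite(r) for r in rows]

# Non-missing counts per variable, analogous to the "Obs" column.
n_obs = {name: sum(r[name] is not None for r in rows) for name in COMPONENTS}
n_obs["A"] = sum(a is not None for a in composite)

print(n_obs)
```

In the toy data every component is observed in at least three of the four rows, yet A is observed in only one, because the missing values fall in different rows. The same mechanism would explain why A in the thread has 69,623 observations out of 235,698 while even the least-observed component, E, has 108,437.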

**References**:
- **st: Missing Observations. Do I need multiple Imputations?** *From:* Abekah Nkrumah <ankrumah@gmail.com>
- **Re: st: Missing Observations. Do I need multiple Imputations?** *From:* A Loumiotis <antonis.loumiotis@gmail.com>
- **Re: st: Missing Observations. Do I need multiple Imputations?** *From:* Abekah Nkrumah <ankrumah@gmail.com>
