
From: René Wevers <rene@ricardis.tudelft.nl>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Issue with multiple imputation -ICE-
Date: Wed, 13 Feb 2008 19:35:27 +0100

First of all, many thanks for all the replies.

I will check the distribution of the (employment) size of the companies in my sample. However, it does not seem very likely to me that this distribution will prove disturbingly nonnormal. My sample is a selection, namely all the 'product innovators', from the Dutch version of the Community Innovation Survey (CIS). The data were gathered by the Dutch central bureau of statistics, which always (deliberately) creates a sample with a large spread of company sizes.

Besides the fact that I don't expect the distribution to be very skewed or otherwise disturbed, there is also the fact that of the 5500 missing values I imputed, more than 2000 turned out (strongly) negative. Thus it seems that either we are completely neglecting some assumptions of -ice-, the data are not appropriate for -ice-, or something is going wrong internally in -ice-.

Since the data are accessible only at the bureau of statistics, I can apply your comments this Friday at the earliest.

Many thanks again and greetings,

René

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Wednesday, February 13, 2008 7:00 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Issue with multiple imputation -ICE-

Control over how percentile ranks (also no doubt known under other names) are calculated is easily possible, as detailed within

FAQ     Calculating percentile ranks or plotting positions     7/02
        How can I calculate percentile ranks?
        How can I calculate plotting positions?
        http://www.stata.com/support/faqs/stat/pcrank.html

Then you need just one more function to get normal scores.

Maarten Buis

--- Mark Lunt <Mark.Lunt@manchester.ac.uk> wrote:
> ICE assumes that continuous variables are normally distributed: if
> that is not the case, impossible values can appear. In particular, if
> you have lots of companies with a few employees and a few companies
> with lots of employees, ICE will impute negative numbers of
> employees. One possible solution is to use the "match" option of ICE.

Good point. An alternative would be to take the logarithm of the number of employees.

> Alternatively, I have written some ado-files which convert variables
> to normal-scores and back: you can convert to normal scores (which
> are normally distributed), perform the imputation on these
> variables, then convert back to your original distribution.

I have had a quick look at this command, and it would seem that you use the rank of each observation and transform that as if it came from a normal distribution. I think that is too strong a transformation, as you throw away all information about the distances between values and use only the rank. This is most clearly visible when two or more observations have the same value. In the way you programmed this procedure, they are given different ranks, and thus different values on your new variable:

*--------- begin example ---------
sysuse auto, clear
nscore rep78, gen(gauss)
twoway scatter gauss rep78
*---------- end example ----------

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
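
The route Nick Cox describes (percentile ranks, then one more function) can be sketched so that tied values share a single normal score. This is only a sketch under stated assumptions: it uses the plotting position (rank - 0.5)/n, which is one of several common choices, and it relies on -egen-'s rank() function, whose default assigns tied values their mean rank. It is not the code inside -nscore-, and the variable names rk, n, pcrank and gauss2 are made up for illustration.

```stata
* Sketch only: normal scores in which tied values get the same score.
* rk, n, pcrank and gauss2 are illustrative names, not part of -nscore-.
sysuse auto, clear

* rank() gives tied values their mean rank by default
egen rk = rank(rep78)

* number of nonmissing values of rep78 (constant across observations)
egen n = count(rep78)

* percentile rank (plotting position); assumed formula, lies strictly in (0, 1)
gen pcrank = (rk - 0.5) / n

* the "one more function": inverse of the standard normal cdf
gen gauss2 = invnormal(pcrank)

* each distinct value of rep78 now maps to a single score
twoway scatter gauss2 rep78
```

This addresses only the tie problem. The transformation still depends on ranks alone, so the objection that it discards the distances between values applies unchanged.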

Follow-Ups:
    st: Best method for imputing dichotomous variables
    From: René Wevers <rene@ricardis.tudelft.nl>

References:
    RE: st: Issue with multiple imputation -ICE-
    From: "Nick Cox" <n.j.cox@durham.ac.uk>

