Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


Re: st: Imputing for missing proportions


From   Alan Acock <acock@me.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Imputing for missing proportions
Date   Fri, 12 Apr 2013 07:44:54 -0700

Nick is right that missing at random is a tough assumption, but it is a weaker assumption than missing completely at random, which is what listwise (casewise) deletion requires.
Alan Acock
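
For readers who want the two assumptions side by side, here is a sketch in the usual textbook notation. The symbols are assumptions added for illustration and do not appear in the thread: R is the missingness indicator, and Y_obs and Y_mis are the observed and missing parts of the data.

```latex
% MCAR: missingness is unrelated to any data values, observed or missing
P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R)

% MAR: missingness may depend on observed values, but not on the missing ones
P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}})
```

MCAR is the special case of MAR in which the dependence on Y_obs also vanishes, which is the sense in which MAR is the weaker of the two assumptions.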

On Apr 12, 2013, at 3:49 AM, Nick Cox <njcoxstata@gmail.com> wrote:

> Well, imputation of missing values is vastly oversold anyway. Missing
> at random? I don't (usually) believe it. (Highly unofficial opinion.)
> Nick
> njcoxstata@gmail.com
> 
> 
> On 12 April 2013 11:44, Geomina Turlea <geomina@yahoo.fr> wrote:
>> I know, but -mi impute- does not support -glm- either.
>> 
>> _________________________________________
>> Geomina Turlea
>> EVERYONE WHO DREAMS BECOMES AN ARTIST
>> 
>> 
>> --- On Fri, 4/12/13, Nick Cox <njcoxstata@gmail.com> wrote:
>> 
>>> From: Nick Cox <njcoxstata@gmail.com>
>>> Subject: Re: st: Imputing for missing proportions
>>> To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
>>> Date: Friday, April 12, 2013, 1:35 PM
>>> I haven't looked at whether it mixes with -mi-, but -glm- with
>>> -link(logit)- is a standard way to handle continuous proportions.
>>> 
>>> Nick
>>> njcoxstata@gmail.com
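
A minimal sketch of the -glm- approach Nick mentions, on simulated data. The variable names (share, x1, x2) and the simulated model are hypothetical. The thread mentions only -link(logit)-; pairing it with family(binomial) and robust standard errors is the common fractional-logit setup and is an assumption here, not something stated in the thread.

```stata
* Simulate a proportion strictly between 0 and 1 (hypothetical data)
clear
set seed 2013
set obs 500
generate double x1 = rnormal()
generate double x2 = rnormal()
generate double share = invlogit(-2 + 0.5*x1 - 0.3*x2 + rnormal(0, 0.5))

* Fractional logit: logit link, binomial family, robust standard errors
glm share x1 x2, family(binomial) link(logit) vce(robust)

* Predicted mean proportions; the logit link keeps them inside (0, 1)
predict double share_hat, mu
summarize share share_hat
```

Because the model is for the mean of the proportion, the predictions cannot fall outside the unit interval, which is the main attraction over a linear regression on the raw share.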
>>> 
>>> 
>>> On 12 April 2013 11:08, Geomina Turlea <geomina@yahoo.fr> wrote:
>>>> Maarten,
>>>> Thank you very much for your answer.
>>>> The problem with -mi impute- is that it does not really have an
>>>> option for regressing proportions. I can't really use truncated
>>>> regression, and my dependent variable is not binary or categorical,
>>>> but a continuous variable between 0 and 1.
>>>> I am considering simulating the multiple imputation with a beta
>>>> regression to estimate the missing values.
>>>> I would be very grateful for a yes/no opinion on this,
>>>> Geomina
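
One option not discussed in the thread is predictive mean matching, sketched below on simulated data. -mi impute pmm- fills each missing value with an observed value borrowed from a nearby case, so imputed proportions stay within the observed range. The variable names, the number of imputations, and the knn() setting are all hypothetical choices for illustration. Whether this is adequate for an industry-by-country-by-year panel is not something the thread addresses.

```stata
* Simulate a proportion with about 20% of values missing (hypothetical data)
clear
set seed 2013
set obs 500
generate double x1 = rnormal()
generate double share = invlogit(-2 + 0.5*x1 + rnormal(0, 0.5))
replace share = . if runiform() < 0.2

* Declare the mi structure and the roles of the variables
mi set wide
mi register imputed share
mi register regular x1

* Predictive mean matching: draw each imputed value from the 5 nearest
* observed cases, so imputations are always observed proportions
mi impute pmm share x1, add(5) knn(5) rseed(2013)

mi describe
```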
>>>> 
>>>> 
>>>> --- On Thu, 4/11/13, Maarten Buis <maartenlbuis@gmail.com> wrote:
>>>>
>>>>> From: Maarten Buis <maartenlbuis@gmail.com>
>>>>>
>>>>> Geomina Turlea wrote:
>>>>>
>>>>>> I have been struggling for a while to estimate missing data for
>>>>>> the share of ICT professionals in total employment, in 59
>>>>>> industries and 27 EU countries over 14 years.
>>>>>> This data exists in the European Labour Force Survey, but the
>>>>>> dataset is incomplete.
>>>>>>
>>>>>> 1. Can I use mi impute with proportions?
>>>>>> 2. I used betafit to fit a distribution with values between 0
>>>>>> and 1. Then I imputed the missing values from the estimated beta
>>>>>> distribution. Is this method superior/inferior to using mi impute?
>>>>>> 3. I tried to use the Kolmogorov-Smirnov test, but I don't know
>>>>>> what I got wrong. Below is a sequence where I created a variable
>>>>>> with a beta distribution and then tested the hypothesis with
>>>>>> the K-S test. The test rejects the null hypothesis that the data
>>>>>> has the distribution I used to create it. How could that be?
>>>>>>
>>>>>> . gen x=rbeta(0.05, 1.77)
>>>>>> . ksmirnov x=rbeta(0.05, 1.77)
>>>>>
>>>>> My first step would be to look at the industries with missing
>>>>> values. Sometimes missing just means 0 or negligible, and looking
>>>>> at the industries would give you a fair guess of whether that is
>>>>> the case. If that is the case, your imputation problem reduces to
>>>>> just a recoding problem.
>>>>>
>>>>> For questions 2 and 3: If you have an imputation problem, then you
>>>>> should use -mi- and not -betafit- (available from SSC), because
>>>>> that is what -mi- was designed for.
>>>>>
>>>>> For question 3: -rbeta()- gives you random numbers from a beta
>>>>> distribution, so that is definitely not something you want to feed
>>>>> into -ksmirnov-. I would just use either -margdistfit- or
>>>>> -hangroot- (also available from SSC) after -betafit- to check the
>>>>> fit.
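
To illustrate the point about question 3: the one-sample form of -ksmirnov- expects the hypothesized cumulative distribution function evaluated at each observation, not a fresh set of random draws. In Stata the beta CDF is ibeta(a, b, x). The sketch below is an illustration under assumptions and not part of Maarten's reply. The seed and sample size are arbitrary, and the variable is stored as double because draws with a shape parameter as small as 0.05 can be extremely close to zero.

```stata
* Draw from a beta(0.05, 1.77) distribution (arbitrary seed and sample size)
clear
set seed 12345
set obs 1000
generate double x = rbeta(0.05, 1.77)

* One-sample K-S test: the right-hand side is the beta CDF evaluated at x,
* ibeta(a, b, x), rather than another call to rbeta()
ksmirnov x = ibeta(0.05, 1.77, x)
```

The original command compared x with a second, independent set of random numbers treated as if they were CDF values, so a rejection is what one would expect from it.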
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP