Re: st: Imputing for missing proportions

From   Nick Cox
Subject   Re: st: Imputing for missing proportions
Date   Fri, 12 Apr 2013 11:35:35 +0100

I haven't looked at whether it mixes with -mi-, but -glm- with
-link(logit)- is a standard way to handle continuous proportions.
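
A minimal sketch of that approach on simulated data, in case it helps. The variable names share and x1 and the data-generating step are made up for illustration, and this says nothing about how it combines with -mi-:

```stata
* Simulated example only: share is a continuous proportion in (0,1),
* x1 is a hypothetical predictor.
clear
set seed 12345
set obs 500
generate double x1 = rnormal()
generate double share = invlogit(-2 + 0.5*x1 + rnormal())

* Logit link with binomial family and robust standard errors,
* applied to a continuous proportion rather than a 0/1 outcome
glm share x1, family(binomial) link(logit) vce(robust)

* Predicted means stay inside the unit interval by construction
predict double share_hat, mu
summarize share share_hat
```

Stata notes that the response has noninteger values when this runs; that is expected for a fractional outcome.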


On 12 April 2013 11:08, Geomina Turlea wrote:
> Maarten,
> Thank you very much for your answer.
> The problem with -mi impute- is that it does not really have an option for regressing proportions. I can't really use truncated regression, and my dependent variable is not binary or categorical, but a continuous variable between 0 and 1.
> I am considering simulating the multiple imputation, using a beta regression to estimate the missing values.
> I would be very grateful for a yes/no opinion on this,
> Geomina
> --- On Thu, 4/11/13, Maarten Buis wrote:

>> Geomina Turlea wrote:
>> > I have been struggling for a while to estimate missing data
>> > for the share of ICT professionals in total employment, in 59
>> > industries, 27 EU countries, and 14 years.
>> > These data exist in the European Labour Force Survey,
>> > but the dataset is incomplete.
>> >
>> > 1. Can I use -mi impute- with proportions?
>> > 2. I used -betafit- to fit a distribution with values
>> > between 0 and 1. Then I imputed the missing values from the
>> > estimated beta distribution. Is this method
>> > superior/inferior to using -mi impute-?
>> > 3. I tried to use the Kolmogorov-Smirnov test, but I
>> > don't know what I got wrong. Below is a sequence where I
>> > created a variable with a beta distribution and then tested
>> > the hypothesis with the K-S test. The test rejects the null
>> > hypothesis that the data have the distribution I used to
>> > create them. How could that be?
>> >
>> > . gen x=rbeta(0.05, 1.77)
>> > . ksmirnov x=rbeta(0.05, 1.77)

>> My first step would be to look at the industries with missing
>> values. Sometimes missing just means 0 or negligible, and looking
>> at the industries would give you a fair guess of whether that is
>> the case. If that is the case, your imputation problem reduces to
>> just a recoding problem.
>> For questions 2 and 3: If you have an imputation problem, then you
>> should use -mi- and not -betafit- (available from SSC), because
>> that is what -mi- was designed for.
>> For question 3: -rbeta()- gives you random numbers from a beta
>> distribution, so that is definitely not something you want to feed
>> into -ksmirnov-. I would just use either -margdistfit- or
>> -hangroot- (also available from SSC) after -betafit- to check the
>> fit.
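
On question 3, a hedged sketch of why the quoted test rejects and one way it could be written instead. The one-sample form of -ksmirnov- expects the hypothesized cumulative distribution function evaluated at x on the right-hand side. A second call to rbeta() supplies fresh random draws there, not CDF values. The sketch below assumes the beta CDF function ibeta(a, b, x) and reuses the shape parameters from the question; the seed and sample size are arbitrary:

```stata
* Draw a sample from a beta(0.05, 1.77) distribution
clear
set seed 2013
set obs 1000
generate double x = rbeta(0.05, 1.77)

* One-sample K-S test: the right-hand side is the theoretical CDF
* evaluated at each observed x, here the cumulative beta distribution
ksmirnov x = ibeta(0.05, 1.77, x)
```

Because x really is drawn from the distribution being tested, the test should reject only about as often as its nominal level. This checks a fully specified distribution; when the parameters are estimated from the same data, the reported p-value is no longer exact, which is one reason to prefer the graphical checks mentioned above.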