


Re: st: Imputation using ML for a lognormal ordered income variable

From   Tinna Asgeirsdottir <>
Subject   Re: st: Imputation using ML for a lognormal ordered income variable
Date   Mon, 19 Nov 2012 15:34:50 +0000

Thanks for the helpful reply, Stas.

I don't think the recommendation referred to interval regression or
multiple imputation. I think it referred to imputing the probable
average or median of each category, but without the obviously false
assumption, implied by using the midpoint, that income is uniformly
distributed within each category.

If I do an ML fit of a lognormal distribution using the lognfit command,
I can get the parameters of the distribution. I guess I should be able
to work out the category averages by hand from there, but I figured
that there might be an easier way.
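
For reference, the "by hand" step has a closed form. If ln(Y) is normal with mean mu and standard deviation sigma, the mean of Y within a bracket (a, b] is exp(mu + sigma^2/2) * [Phi(z_b - sigma) - Phi(z_a - sigma)] / [Phi(z_b) - Phi(z_a)], where z = (ln(bound) - mu) / sigma and Phi is the standard normal CDF. The sketch below is plain arithmetic in Python rather than Stata code. The function name is made up for illustration, and the mu and sigma in the usage lines are placeholder numbers, not estimates from any data. It also inherits the lognormal assumption, including the caveat in the quoted reply about the top tail.

```python
from math import exp, log, inf
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def _cdf(z):
    """Standard normal CDF with explicit handling of infinite arguments."""
    if z == -inf:
        return 0.0
    if z == inf:
        return 1.0
    return _STD.cdf(z)


def lognormal_bracket_mean(mu, sigma, lower, upper=inf):
    """Mean of Y given lower < Y <= upper, where ln(Y) ~ N(mu, sigma^2).

    Hypothetical helper: pass upper=inf for the open-ended top bracket
    and lower=0 for the bottom bracket.
    """
    # Standardised log bounds of the bracket.
    z_lo = (log(lower) - mu) / sigma if lower > 0 else -inf
    z_hi = (log(upper) - mu) / sigma if upper < inf else inf
    # Probability mass that the lognormal puts in the bracket.
    prob = _cdf(z_hi) - _cdf(z_lo)
    # Partial expectation E[Y * 1{lower < Y <= upper}].
    partial = exp(mu + sigma ** 2 / 2) * (_cdf(z_hi - sigma) - _cdf(z_lo - sigma))
    return partial / prob


# Placeholder parameters (assumed, not estimated from any data).
mu_hat, sigma_hat = 12.0, 0.6
print(lognormal_bracket_mean(mu_hat, sigma_hat, 400_000))           # open top bracket
print(lognormal_bracket_mean(mu_hat, sigma_hat, 100_000, 150_000))  # interior bracket
```

With lower=0 and no upper bound the function reduces to the usual lognormal mean exp(mu + sigma^2/2), which is a quick sanity check.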


2012/11/17 Stas Kolenikov <>:
> Lognormal distribution will likely underestimate how heavy the top
> tail is (although if you are interested in Iceland, you may have a
> very egalitarian income distribution, so the shape of that tail may
> not be that terrible). Lognormal distribution is a very cute model to
> play with and very dangerous in real work. In my work on Russian data,
> changing the assumptions about the top tail moved our Gini index from
> 0.48 to 0.60... and that's a little bit of a difference, let's put it
> this way.
> The recommendation you have heard probably concerns -intreg-; see
> -help intreg-.
> Imputing the mean income over a group will lead to a multitude of
> problems due to artificially compressed variability and values that
> are simply too low for the top group. If you desperately need to
> impute, you would want to go with multiple imputations (-help mi-),
> although you would want to read the MI manual and a paper
> ( or two
> ( if you are not familiar with the
> technique. What I have done in one of my projects recently was to
> generate the plausible values of the variable of interest a bunch of
> times (say, 50... the original suggestion to use 5 imputations dates
> back to the late 1970s... and your smartphone now has more computing
> power than a Cray supercomputer of that era) and make Stata treat
> them as imputations stored in its mi wide format.
> --
> -- Stas Kolenikov, PhD, PStat (SSC)  ::
> -- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at
> srbi dot com
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
> On Sat, Nov 17, 2012 at 6:12 AM, Tinna Asgeirsdottir
> <> wrote:
>> Dear Stata users,
>> In my data I have income in 13 groups. The top group is open ended. I
>> am trying to impute sensible values and would like to use this as a
>> continuous variable. I am especially concerned about the top category.
>> It has been suggested to me that I should use Stata's ML command
>> instead of using each category's midpoint. I am having trouble finding
>> what I need on the internet. Thus I wonder if anyone can tell me how
>> to fit a lognormal distribution to the variable and subsequently infer
>> the average income in the top bracket. If you know how to do this in
>> general for all the categories, that would be great as well, since the
>> distribution within the other brackets is surely not uniform. However,
>> I think finding a good solution for my top category is the most
>> important thing.
>> Best regards,
>> Tinna
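
The idea in the quoted reply of generating plausible values many times can be sketched as inverse-CDF draws from the fitted lognormal, truncated to each respondent's bracket. This is only an illustration in Python, not the code used in the project mentioned above. The function name and the parameter values are assumptions. The sketch also holds mu and sigma fixed, whereas a full multiple imputation would redraw them for each imputation to reflect estimation uncertainty.

```python
import random
from math import exp, log, inf
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def draw_plausible_values(mu, sigma, lower, upper=inf, m=50, rng=None):
    """Draw m values of Y from ln(Y) ~ N(mu, sigma^2) restricted to (lower, upper].

    Hypothetical helper: each draw is one plausible income for a respondent
    known only to fall in the given bracket.
    """
    rng = rng or random.Random()
    # CDF of the lognormal at the bracket bounds.
    p_lo = _STD.cdf((log(lower) - mu) / sigma) if lower > 0 else 0.0
    p_hi = _STD.cdf((log(upper) - mu) / sigma) if upper < inf else 1.0
    draws = []
    for _ in range(m):
        # Uniform draw mapped into the bracket's probability range.
        p = p_lo + rng.random() * (p_hi - p_lo)
        # inv_cdf requires 0 < p < 1, so keep p strictly inside.
        p = min(max(p, 1e-12), 1 - 1e-12)
        draws.append(exp(mu + sigma * _STD.inv_cdf(p)))
    return draws


# Placeholder parameters (assumed, not estimated from any data).
values = draw_plausible_values(12.0, 0.6, 400_000, m=50, rng=random.Random(2012))
print(min(values), max(values))
```

Each of the m draws would then become one imputed copy of the income variable, for example one column per imputation in a wide layout.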
