Re: st: Computing the Gini or another inequality coefficient from a limited number of data points

From   Jen Zhen
Subject   Re: st: Computing the Gini or another inequality coefficient from a limited number of data points
Date   Fri, 10 Feb 2012 17:00:09 +0100

Dear Daniel and Austin,

Thanks a lot for your suggestions. They look very helpful, and I will
now look into them in detail!


On Fri, Feb 10, 2012 at 4:30 PM, Austin Nichols wrote:
> Jen Zhen:
> ssc install dagfit
> help dagfit
> for a start, then read the items referred to in that help file.
> On Fri, Feb 10, 2012 at 3:52 AM, Jen Zhen wrote:
>> Dear list members,
>> I would like to compute a measure of income inequality similar to the
>> Gini index. I do not know everyone's income, so need to make an
>> approximation.
>> (1)
>> For the 5 most recent years, I know for 6 income brackets how many
>> individuals there are and their joint income, hence also the average
>> income in the bracket. For the full-fledged Gini index I would need to
>> know the area under the curve which shows the cumulative income
>> against the cumulative number of taxpayers (to visualize what I mean,
>> look e.g. at the 2nd figure here:
>> With the information I have, I do not know the entire curve; I know
>> only 7 points on it (the six points mentioned plus the origin). I
>> think I can approximate the said area by assuming that the curve is a
>> straight line between adjacent points, but that will systematically
>> underestimate the true degree of inequality. So I'm wondering if there
>> is a sensible way to smooth the curve and hence get a better
>> approximation?
>> (2)
>> For the 5 earliest years, unfortunately, I know only the number of
>> individuals in each bracket but not their joint income. My idea is
>> to regress the mean income in each bracket on a 3rd-order polynomial
>> in the year to see how it develops over the 5 latest years, use this
>> to predict the mean income for each bracket in the 5 earlier years,
>> and then apply the procedure described in (1). A simpler alternative
>> would be to just use the midpoint of each bracket, but I guess this
>> would be less accurate.
>> Does this procedure sound sensible? Or is there a better way to
>> compute inequality from these data?
>> Thank you so much and best regards,
>> JZ
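
The straight-line approximation described in (1) can be sketched as follows. This is a minimal illustration in Python, not the method used by dagfit: it joins the known points of the cumulative-income curve with straight lines and applies the trapezoid rule, so the result is a lower bound on the Gini index. The function name gini_lower_bound and the bracket figures in the example are hypothetical and are not taken from the thread.

```python
def gini_lower_bound(counts, means):
    """Gini index from grouped data, assuming straight lines between
    the known points of the cumulative-income curve.

    counts: number of individuals in each bracket
    means:  average income in each bracket
    Because within-bracket inequality is ignored, the value returned
    can only understate the true Gini index.
    """
    if len(counts) != len(means):
        raise ValueError("counts and means must have the same length")

    # Order brackets from poorest to richest so the curve is convex.
    brackets = sorted(zip(means, counts))
    total_n = sum(n for _, n in brackets)
    total_income = sum(m * n for m, n in brackets)

    area = 0.0            # area under the piecewise-linear curve
    cum_p = cum_l = 0.0   # cumulative population and income shares
    for m, n in brackets:
        next_p = cum_p + n / total_n
        next_l = cum_l + m * n / total_income
        # Trapezoid between two adjacent known points.
        area += (next_p - cum_p) * (cum_l + next_l) / 2
        cum_p, cum_l = next_p, next_l

    return 1 - 2 * area


# Hypothetical data: two equally sized brackets with mean incomes 1 and 3.
print(gini_lower_bound([100, 100], [1.0, 3.0]))
```

For the years in (2) where only head counts are known, the means argument could hold predicted bracket means or bracket midpoints instead; the quality of the result then depends entirely on how good those stand-in values are.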