
# Re: st: Computing the Gini or another inequality coefficient from a limited number of data points

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Computing the Gini or another inequality coefficient from a limited number of data points
Date: Fri, 10 Feb 2012 10:30:00 -0500

```
Jen Zhen <jenzhen99@gmail.com>:
ssc install dagfit
help dagfit

for a start, then read the items referred to in that help file.

On Fri, Feb 10, 2012 at 3:52 AM, Jen Zhen <jenzhen99@gmail.com> wrote:
> Dear list members,
>
> I would like to compute a measure of income inequality similar to the
> Gini index. I do not know everyone's income, so I need to make an
> approximation.
>
> (1)
> For the 5 most recent years, I know for 6 income brackets how many
> individuals there are and their joint income, hence also the average
> income in the bracket. For the full-fledged Gini index I would need to
> know the area under the curve which shows the cumulative income
> against the cumulative number of taxpayers (to visualize what I mean,
> look e.g. at the 2nd figure here:
> http://en.wikipedia.org/wiki/Gini_index).
> Now I believe that with the information I have I don't know the entire
> curve but I know only 7 points on it (the six points mentioned plus
> the origin). So I think I can approximate the said area if I simply
> assume that between the 7 points the line is straight, but that will
> systematically underestimate the true degree of inequality. So I'm
> wondering if there is a sensible way to smooth the curve and hence get
> a better approximation?
>
> (2)
> For the 5 earliest years unfortunately I know only the number of
> individuals in each bracket but not their joint income. So my idea was
> that I would regress the mean income in each bracket on a 3rd-order
> polynomial in the year to see how it develops in the 5 latest years,
> use this to predict/estimate the mean income for each bracket in the 5
> earlier years, and then apply the procedure described in (1). A simpler
> alternative would be to just use the midpoint of each bracket, but I
> guess this would be less accurate.
>
> Does this procedure sound sensible? Or is there a better way to
> compute inequality from these data?
>
> Thank you so much and best regards,
> JZ
```
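The straight-line approximation described in point (1) of the quoted question can be sketched as below. This is only an illustration in Python, not the method the reply points to: the function name `grouped_gini` and the bracket figures are hypothetical, and the brackets are assumed to be ordered from lowest to highest mean income. As the question notes, joining the known Lorenz-curve points with straight lines ignores inequality within each bracket, so the result is a lower bound on the Gini index.

```python
def grouped_gini(counts, incomes):
    """Lower-bound Gini index from grouped data.

    counts  -- number of individuals in each bracket
    incomes -- total (joint) income of each bracket
    Brackets are assumed to be sorted from poorest to richest.
    """
    total_n = sum(counts)
    total_y = sum(incomes)
    cum_n = 0.0   # cumulative population share
    cum_y = 0.0   # cumulative income share
    area = 0.0    # area under the piecewise-linear Lorenz curve
    for n, y in zip(counts, incomes):
        next_n = cum_n + n / total_n
        next_y = cum_y + y / total_y
        # Trapezoid between two consecutive known points on the curve.
        area += (next_n - cum_n) * (cum_y + next_y) / 2
        cum_n, cum_y = next_n, next_y
    # Gini = 1 - 2 * (area under the Lorenz curve).
    return 1 - 2 * area


# Hypothetical figures for six brackets in one year (not from the post).
counts = [400, 300, 150, 100, 40, 10]
incomes = [4000, 7500, 6000, 6000, 4000, 2500]
print(grouped_gini(counts, incomes))
```

For the earlier years in point (2), the same function could be applied once a mean income per bracket has been estimated, by passing `count * estimated_mean` as each bracket's total income.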