Re: st: Computing the Gini or another inequality coefficient from a limited number of data points

From	Daniel Feenberg <[email protected]>
To	[email protected]
Subject	Re: st: Computing the Gini or another inequality coefficient from a limited number of data points
Date	Fri, 10 Feb 2012 06:48:53 -0500 (EST)


On Fri, 10 Feb 2012, Jen Zhen wrote:

Dear list members,

I would like to compute a measure of income inequality similar to the
Gini index. I do not know everyone's income, so I need to make an
approximation.

(1)
For the 5 most recent years, I know for 6 income brackets how many
individuals there are and their joint income, hence also the average
income in the bracket. For the full-fledged Gini index I would need to
know the area under the curve which shows the cumulative income
against the cumulative number of tax payers (to visualize what I mean,
look e.g. at the 2nd figure here:
http://en.wikipedia.org/wiki/Gini_index).
Now I believe that with the information I have I don't know the entire
curve but I know only 7 points on it (the six points mentioned plus
the origin). So I think I can approximate the said area if I simply
assume that between the 7 points the line is straight, but that will
systematically underestimate the true degree of inequality. So I'm
wondering if there is a sensible way to smooth the curve and hence get
a better approximation?
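The straight-line approximation described in (1) can be sketched as follows. This is a minimal illustration, not code from the thread: the function name `gini_from_brackets` and the sample numbers are hypothetical. Because it joins the known Lorenz-curve points with straight segments, it treats everyone inside a bracket as having the bracket mean, so the result is a lower bound on the true Gini.

```python
def gini_from_brackets(counts, totals):
    """Lower-bound Gini from per-bracket head counts and income totals.

    Joins the known Lorenz-curve points with straight lines (trapezoid rule),
    which ignores all inequality within a bracket.
    """
    # Drop empty brackets, then order by mean income so the curve is convex.
    brackets = [(c, t) for c, t in zip(counts, totals) if c > 0]
    brackets.sort(key=lambda b: b[1] / b[0])

    n_all = sum(c for c, _ in brackets)
    y_all = sum(t for _, t in brackets)

    area = 0.0   # area under the piecewise-linear Lorenz curve
    cum_y = 0.0  # cumulative income share at the left edge of the segment
    for c, t in brackets:
        prev = cum_y
        cum_y += t / y_all
        area += (c / n_all) * (prev + cum_y) / 2.0  # one trapezoid

    return 1.0 - 2.0 * area


# Hypothetical data for six brackets: taxpayers and joint income per bracket.
counts = [400, 300, 150, 100, 40, 10]
totals = [4.0e6, 7.5e6, 6.0e6, 6.0e6, 4.0e6, 3.0e6]
print(round(gini_from_brackets(counts, totals), 4))
```

Any smoothing of the curve between the known points can only add within-bracket inequality, so it should yield a value at or above this figure.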

(2)
For the 5 earliest years unfortunately I know only the number of
individuals in each bracket but not their joint income. So my idea was
that I would regress the mean income in each bracket on a 3rd-order
function in the year to see how it develops in the 5 latest years and
use this to predict/estimate the mean income for each bracket in the 5
earlier years, then use the procedure described in (1). A simpler
alternative would be to just use the midpoint of each bracket, but I
guess this would be less good.

Does this procedure sound sensible? Or is there a better way to
compute inequality from these data?

You could assume an income distribution function, such as log-normal, Pareto, or even Gini, and solve for the parameters using the available data. We do this in "Income Inequality and the Incomes of Very High-Income Taxpayers: Evidence from Tax Returns"

  http://www.nber.org/chapters/c10880

with a Pareto. The Pareto is a two-parameter distribution, so we solve for the parameters within an income bracket using only the 2 breakpoints that define the bracket. That way there is no need to extrapolate beyond the observed data, or coerce data everywhere in the income distribution to a small number of parameters estimated over the whole distribution.
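A within-bracket fit of this kind can be sketched with a Pareto tail, S(x) = (k/x)^alpha, where S(x) is the share of taxpayers with income above x. This is an illustration under stated assumptions, not the exact procedure of the cited chapter: the function names and numbers are hypothetical, and the open-ended top bracket would need separate treatment. Given the counts of taxpayers above the two breakpoints a and b of a bracket, alpha follows from the ratio of the counts, and the implied mean income inside the bracket can stand in for a bracket mean that was not reported.

```python
import math


def pareto_alpha(a, b, n_above_a, n_above_b):
    """Pareto shape parameter implied by the counts above breakpoints a < b."""
    # S(a) / S(b) = (b / a) ** alpha, so alpha is a ratio of logs.
    return math.log(n_above_a / n_above_b) / math.log(b / a)


def pareto_bracket_mean(a, b, alpha):
    """Mean income between a and b under a Pareto density with shape alpha."""
    if abs(alpha - 1.0) < 1e-12:
        # Limiting case alpha = 1, where the general formula divides by zero.
        return math.log(b / a) / (1.0 / a - 1.0 / b)
    num = a ** (1.0 - alpha) - b ** (1.0 - alpha)
    den = a ** (-alpha) - b ** (-alpha)
    return alpha / (alpha - 1.0) * num / den


# Hypothetical bracket from 10,000 to 20,000 with 400 taxpayers above the
# lower breakpoint and 100 above the upper one.
alpha = pareto_alpha(10_000, 20_000, 400, 100)
print(alpha, pareto_bracket_mean(10_000, 20_000, alpha))
```

Because each bracket gets its own alpha, nothing is extrapolated beyond the two breakpoints that bound it.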


Daniel Feenberg
