

From: Joerg Luedicke <joerg.luedicke@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Computing the Gini or another inequality coefficient from a limited number of data points
Date: Fri, 10 Feb 2012 09:32:19 -0500

Daniel,

do you mean a power law distribution? (Just had a quick glance at your paper and saw that you were talking about Pareto distributions there. The Poisson distribution has only one parameter.)

Joerg

On Fri, Feb 10, 2012 at 6:48 AM, Daniel Feenberg <feenberg@nber.org> wrote:
>
> On Fri, 10 Feb 2012, Jen Zhen wrote:
>
>> Dear list members,
>>
>> I would like to compute a measure of income inequality similar to the
>> Gini index. I do not know everyone's income, so need to make an
>> approximation.
>>
>> (1)
>> For the 5 most recent years, I know for 6 income brackets how many
>> individuals there are and their joint income, hence also the average
>> income in the bracket. For the full-fledged Gini index I would need to
>> know the area under the curve which shows the cumulative income
>> against the cumulative number of tax payers (to visualize what I mean,
>> look e.g. at the 2nd figure here:
>> http://en.wikipedia.org/wiki/Gini_index).
>> Now I believe that with the information I have I don't know the entire
>> curve but I know only 7 points on it (the six points mentioned plus
>> the origin). So I think I can approximate the said area if I simply
>> assume that between the 7 points the line is straight, but that will
>> systematically underestimate the true degree of inequality. So I'm
>> wondering if there is a sensible way to smooth the curve and hence get
>> a better approximation?
>>
>> (2)
>> For the 5 earliest years unfortunately I know only the number of
>> individuals in each bracket but not their joint income. So my idea was
>> that I would regress the mean income in each bracket on a 3rd-order
>> function in the year to see how it develops in the 5 latest years and
>> use this to predict/estimate the mean income for each bracket in the 5
>> earlier years, then use the procedure described in (1). A simpler
>> alternative would be to just use the midpoint of each bracket, but I
>> guess this would be less good.
>>
>> Does this procedure sound sensible? Or is there a better way to
>> compute inequality from these data?
>>
>
> You could assume an income distribution function, such as log-normal,
> poisson, or even Gini and solve for the parameters using the available data.
> We do this in "Income Inequality and the Incomes of Very High-Income
> Taxpayers: Evidence from Tax Returns"
>
> http://www.nber.org/chapters/c10880
>
> with a Poisson. The Poisson is a two parameter distribution, so we solve for
> the parameters within an income bracket using only the 2 breakpoints that
> define the bracket. That way there is no need to extrapolate beyond the
> observed data, or coerce data everywhere in the income distribution to a
> small number of parameters estimated over the whole distribution.
>
> Daniel Feenberg
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
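For readers following the thread, the straight-line approximation described in point (1) of Jen's message can be sketched roughly as below. This is only an illustration under stated assumptions, not code from any of the posters: the function name `grouped_gini_lower_bound` and the bracket figures are hypothetical, and Python is used purely for convenience. The sketch joins the known Lorenz curve points with straight segments and applies the trapezoid rule, which treats everyone within a bracket as having the bracket mean and therefore gives a lower bound on the Gini index.

```python
def grouped_gini_lower_bound(counts, totals):
    """Gini index from bracket counts and bracket income totals.

    Assumes the Lorenz curve is a straight line between the known
    points, i.e. no inequality within a bracket, so the result is a
    lower bound on the true Gini index.
    """
    if len(counts) != len(totals):
        raise ValueError("counts and totals must have the same length")
    # Drop empty brackets, then order brackets by mean income so the
    # cumulative shares trace out a proper (convex) Lorenz curve.
    brackets = [(c, t) for c, t in zip(counts, totals) if c > 0]
    if not brackets:
        raise ValueError("need at least one non-empty bracket")
    brackets.sort(key=lambda ct: ct[1] / ct[0])

    n_total = sum(c for c, _ in brackets)
    y_total = sum(t for _, t in brackets)

    area = 0.0           # area under the piecewise-linear Lorenz curve
    cum_p = cum_l = 0.0  # cumulative population and income shares
    for c, t in brackets:
        next_p = cum_p + c / n_total
        next_l = cum_l + t / y_total
        # Trapezoid between two adjacent Lorenz points.
        area += (next_p - cum_p) * (cum_l + next_l) / 2.0
        cum_p, cum_l = next_p, next_l

    # Gini = 1 - 2 * (area under the Lorenz curve).
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    # Hypothetical figures for six brackets: taxpayers and joint income.
    counts = [400, 300, 150, 100, 40, 10]
    totals = [4.0e6, 9.0e6, 9.0e6, 1.2e7, 1.0e7, 8.0e6]
    print(grouped_gini_lower_bound(counts, totals))
```

Any smoothing of the curve, or a distribution fitted within each bracket as Daniel suggests, would add within-bracket inequality on top of this figure, so the value returned here is best read as a floor rather than an estimate.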

**References**:

- **st: Computing the Gini or another inequality coefficient from a limited number of data points** *From:* Jen Zhen <jenzhen99@gmail.com>
- **Re: st: Computing the Gini or another inequality coefficient from a limited number of data points** *From:* Daniel Feenberg <feenberg@nber.org>
