# st: RE: Ungrouping income data

From: "Nick Cox" | Subject: st: RE: Ungrouping income data | Date: Fri, 4 Feb 2005 18:09:18 -0000

```I doubt that this is canned, but a more
immediate problem is precisely what you want.

The median is in the middle of the distribution;
the precise form of the rest of the distribution
is irrelevant. So you need to find the middle
category, which -summarize- will do, and then
decide how you want to interpolate between its
boundaries. Linear might be good enough, or
you might want something fancier. E.g. suppose
the fourth category contains the median,
that its boundaries are 10,000 and 15,000,
and that 40% are below 10,000 and 40% are above
15,000. The category then holds the middle 20%,
so the 50% point lies halfway across it, and
linear interpolation gives 12,500 as the median.

If you want some measure of the level of
the distribution that takes the rest of
the distribution into account in some other
way, it can't be a median, but must be
something else.

As for the white magic of deciding what
the upper open category means, I guess
there are many ways to do it, perhaps
each based on fitting a specified
distribution from selected quantiles.
I doubt that there is a single way to do it.

Nick
n.j.cox@durham.ac.uk

Alan Acock wrote:

> I have data on income grouped in ranges (0-29,999) ...
> (100,000 and over).
> Is there a program in Stata that will convert these ranges
> to point values (medians)? Is there a way that takes the
> distribution into account? Is there a way to get a value
> for the end category of 100,000 or more that takes the
> distribution into account? I know demographers have
> formulas for this, but has anybody put these into a Stata
> program?

```
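
The two calculations discussed in the reply can be sketched in a few lines. This is an illustration in Python rather than a Stata program, and not anything posted in the thread. The function names `grouped_median` and `pareto_open_median` are hypothetical. The first function is the linear interpolation described above, checked against the 10,000 to 15,000 example. The second assumes a Pareto upper tail fitted from two known quantiles, which is only one of the many possible choices for the open category; the 75,000 boundary and the 20% and 10% shares are made-up inputs, not figures from the question.

```python
import math


def grouped_median(bounds, shares):
    """Median by linear interpolation within the category that contains it.

    bounds: (lower, upper) for each category in ascending order; the open
            top category may use math.inf as its upper bound.
    shares: frequency or percentage in each category.
    """
    half = sum(shares) / 2
    below = 0.0
    for (lower, upper), share in zip(bounds, shares):
        if share > 0 and below + share >= half:
            if math.isinf(upper):
                raise ValueError("median lies in the open category")
            # Fraction of the way across the category needed to reach 50%.
            return lower + (half - below) / share * (upper - lower)
        below += share
    raise ValueError("no category contains the median")


def pareto_open_median(x1, p1, x2, p2):
    """Median of the open category above x2, ASSUMING a Pareto tail.

    p1 and p2 are the proportions above x1 and x2, with x1 < x2 and
    p1 > p2. Under a Pareto tail the proportion above x is proportional
    to x ** -alpha, so two such quantiles identify alpha.
    """
    alpha = math.log(p1 / p2) / math.log(x2 / x1)
    # Conditional on exceeding x2, half of the tail lies above this value.
    return x2 * 2 ** (1 / alpha)


# The example from the reply: 40% below 10,000, 20% inside, 40% above 15,000.
# The split of the lower 40% across three categories is arbitrary.
bounds = [(0, 2500), (2500, 5000), (5000, 10000), (10000, 15000),
          (15000, math.inf)]
shares = [10, 10, 20, 20, 40]
print(grouped_median(bounds, shares))  # 12500.0

# Hypothetical inputs: 20% above 75,000 and 10% above 100,000.
print(round(pareto_open_median(75000, 0.20, 100000, 0.10)))  # about 133,333
```

A different assumed tail (lognormal, say) would give a different value for the open category, which is the point made in the reply: there is no single correct answer, only a choice of distribution to defend.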