The Stata listserver

st: RE: Ungrouping income data

From   "Nick Cox" <>
To   <>
Subject   st: RE: Ungrouping income data
Date   Fri, 4 Feb 2005 18:09:18 -0000

I doubt that this is canned, but a more 
immediate problem is deciding precisely what you want. 

The median is in the middle of the distribution; 
the precise form of the rest of the distribution
is irrelevant. So you need to find the middle 
category, which -summarize, detail- will do, and then 
decide how you want to interpolate between its 
boundaries. Linear might be good enough, or 
you might want something fancier. E.g. suppose
the fourth category contains the median, 
and that its boundaries are 10,000 and 15,000
and that 40% are below 10,000 and 40% are above
15,000. A linear interpolation gives 12,500
as the median. 
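
To make the arithmetic concrete, here is a minimal sketch of the linear
interpolation, written in Python purely for illustration (it is not a
Stata program, and the same few lines of arithmetic carry over directly).
The function name and the category shares other than "40% below, 40%
above" are made up for the example.

```python
def interpolated_median(bounds, shares):
    """Linearly interpolate the median from grouped data.

    bounds: list of (lower, upper) category boundaries; upper may be None
            for an open-ended top category.
    shares: percentage of observations in each category, in the same order.
    """
    below = 0.0  # cumulative percentage below the current category
    for (lower, upper), share in zip(bounds, shares):
        if share > 0 and below + share >= 50:
            if upper is None:
                raise ValueError("median falls in the open-ended category")
            # fraction of the way through the category at which 50% is reached
            return lower + (50 - below) / share * (upper - lower)
        below += share
    raise ValueError("shares do not reach 50%")


# Hypothetical categories: 40% below 10,000, 20% in 10,000-15,000, 40% above
bounds = [(0, 5000), (5000, 7500), (7500, 10000), (10000, 15000), (15000, None)]
shares = [15, 10, 15, 20, 40]
print(interpolated_median(bounds, shares))  # 12500.0
```

Only the median category and the percentage below it enter the result;
how the remaining 80% is spread over the other categories makes no
difference.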

If you want some measure of the level of 
the distribution that takes the rest of 
the distribution into account in some other 
way, it can't be a median, but must be 
something else. 

As for the white magic of deciding what 
the upper open category means, I guess
there are many ways to do it, perhaps 
each based on fitting a specified 
distribution from selected quantiles. 
I doubt that there is a single way to do it. 
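
As one possible illustration only, and not a recommendation, here is a
sketch in Python of fitting a Pareto tail from two quantiles. It assumes
that the proportion above income x falls off as x to the power -alpha in
the upper tail, and estimates alpha from the shares above the last two
category boundaries. The function name and the numbers are hypothetical;
a different assumed distribution would give different answers.

```python
import math


def pareto_tail(x_lower, share_above_lower, x_top, share_above_top):
    """Fit a Pareto tail through two (boundary, share above) points.

    Returns the shape alpha, plus the median and mean of the open
    category above x_top implied by that Pareto tail.
    """
    # S(x) proportional to x**(-alpha) implies
    # log(S1 / S2) = alpha * log(x_top / x_lower)
    alpha = math.log(share_above_lower / share_above_top) / math.log(x_top / x_lower)
    # median of a Pareto distribution with minimum x_top
    median_top = x_top * 2 ** (1 / alpha)
    # the mean exists only if alpha > 1
    mean_top = x_top * alpha / (alpha - 1) if alpha > 1 else math.inf
    return alpha, median_top, mean_top


# Hypothetical shares: 40% above 50,000 and 10% above 100,000
alpha, median_top, mean_top = pareto_tail(50_000, 0.40, 100_000, 0.10)
print(alpha, round(median_top), round(mean_top))
```

With these made-up shares alpha is 2, so the open category would be
assigned a median of about 141,000 and a mean of 200,000. The answer is
sensitive to which two boundaries are used, which is part of why I doubt
that there is a single way to do it.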


Alan Acock wrote:
> I have data on income grouped in ranges (0-29,999) ... 
> (100,000 and over).
> Is there a program in Stata that will convert these ranges to 
> point values (medians)? Is there a way that takes the 
> distribution into account? Is there a way to get a value for 
> the end category of 100,000 or more that takes the distribution 
> into account? I know demographers have formulas for this, but 
> has anybody put these into a Stata program?
