The Stata listserver

st: Imputing Mean of Top-Coded Income Category

From   "Joshua A. Guetzkow" <joshg@Princeton.EDU>
Subject   st: Imputing Mean of Top-Coded Income Category
Date   Sun, 14 Jul 2002 18:20:37 -0400 (EDT)


I have top-coded, continuous CPS data on earnings. I want to impute the
mean income of this group of top-coded earners, making the assumption that
the upper tail follows a Pareto distribution. I'm wondering if anyone has
suggestions about how to do this in Stata (or even just generally how to
do it).

Some notes:

The standard method typically imputes the mean of top-coded earners from
grouped (categorical) earnings data, using the following formula:

Mean income for top-coded category = X * V/(V - 1)
X = lower limit of the top-coded/open-ended category (the topcode)
V = (c - d)/(b - a)
a = log of lower limit of interval preceding top-coded/open-ended category
b = log of lower limit of top-coded/open-ended category
c = log of the sum of the frequencies in the top-coded category and the
category preceding it
d = log of the frequency in the top-coded category
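
To make the arithmetic of the grouped-data formula concrete, here is a
minimal sketch in Python (not Stata, purely for illustration). The function
name, the argument names, and the example numbers are hypothetical and do
not come from any CPS file; the code simply evaluates V = (c - d)/(b - a)
and then X * V/(V - 1).

```python
import math

def pareto_topcode_mean(topcode, prev_lower, n_top, n_prev):
    """Return (V, imputed mean) for the open-ended top category.

    topcode    -- lower limit of the top-coded category (X)
    prev_lower -- lower limit of the category just below it
    n_top      -- number of cases in the top-coded category
    n_prev     -- number of cases in the preceding category
    """
    a = math.log(prev_lower)
    b = math.log(topcode)
    c = math.log(n_top + n_prev)
    d = math.log(n_top)
    v = (c - d) / (b - a)
    if v <= 1:
        # A Pareto tail has a finite mean only when V > 1.
        raise ValueError(f"V = {v:.3f} <= 1: the Pareto mean is not defined")
    return v, topcode * v / (v - 1)

# Hypothetical numbers: topcode 100,000; preceding interval starts at 50,000;
# 100 top-coded cases and 300 cases in the preceding interval.
v, mean_top = pareto_topcode_mean(100_000, 50_000, 100, 300)
print(v, mean_top)
```

With these made-up inputs V comes out at about 2, so the imputed mean is
about twice the topcode.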

The problem with using this method given continuous earnings data (like
the CPS) is that the result is highly dependent on the choice one makes
about what interval to define as the "preceding category."
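
One way to see this sensitivity is to recompute V and the imputed mean for
several candidate lower limits of the "preceding category." The sketch below
is in Python, for illustration only. It assumes a plain list of earnings in
which top-coded cases are recorded at the topcode; the function and argument
names are hypothetical.

```python
import math

def topcoded_mean_by_cutoff(earnings, topcode, cutoffs):
    """Map each candidate lower limit to (V, imputed mean of top-coded cases)."""
    n_top = sum(1 for y in earnings if y >= topcode)
    if n_top == 0:
        raise ValueError("no top-coded observations")
    results = {}
    for lower in cutoffs:
        if not 0 < lower < topcode:
            raise ValueError("each cutoff must lie between 0 and the topcode")
        # Cases in the interval [lower, topcode): the "preceding category".
        n_prev = sum(1 for y in earnings if lower <= y < topcode)
        v = (math.log(n_top + n_prev) - math.log(n_top)) / (
            math.log(topcode) - math.log(lower)
        )
        # The Pareto mean exists only for V > 1; report NaN otherwise.
        mean = topcode * v / (v - 1) if v > 1 else float("nan")
        results[lower] = (v, mean)
    return results
```

A call such as topcoded_mean_by_cutoff(earnings, 100_000, [25_000, 50_000,
75_000]) returns one (V, mean) pair per cutoff. If the tail were exactly
Pareto, every cutoff would give roughly the same V, so the spread across
cutoffs shows how much the imputed mean depends on that choice.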

Another method would use the mode and median to solve the equation:

median = mode * 2^(1/V)

(using the observed median and mode of the sample to calculate V and solve
the equation above)
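
Solving that equation for V gives V = ln(2) / ln(median/mode). A short
Python sketch of this step follows (illustration only; the function name and
the example values are hypothetical, not estimates from CPS data).

```python
import math

def v_from_median_mode(median, mode):
    """Solve median = mode * 2**(1/V) for V."""
    if median <= 0 or mode <= 0 or median == mode:
        raise ValueError("need positive median and mode with median != mode")
    return math.log(2) / math.log(median / mode)

# Hypothetical values: a median 25% above the mode.
v_ok = v_from_median_mode(50_000, 40_000)

# A median below the mode makes log(median/mode) negative, so V is negative.
v_bad = v_from_median_mode(40_000, 50_000)

for v in (v_ok, v_bad):
    # The multiplier applied to the topcode is V/(V - 1).
    print(round(v, 3), round(v / (v - 1), 3))
```

In the second case the multiplier V/(V - 1) lies between 0 and 1, so the
imputed mean would fall below the topcode.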

The problem here is that when the median is less than the mode, it gives a
value for V less than 1, such that multiplying the top code by V/(V - 1) gives a mean
for the top-coded income that is LESS than the top code, much to my

Any help on this would be much appreciated!

Josh Guetzkow

Princeton University
Dept. of Sociology
Wallace Hall
Princeton, NJ 08544
