
# st: Mean of interval-censored data

Subject: st: Mean of interval-censored data

Date: Mon, 28 Nov 2011 09:11:37 -0000

------------------------------

Date: Sun, 27 Nov 2011 11:28:58 -0600
From: Paul von Hippel <paulvonhippel.utaustin@gmail.com>
Subject: st: Mean of interval-censored data

I am interested in using the intcens command for Stata to model an
income distribution, and I would welcome advice on how the
program handles weights, and on whether it can output expected values.

I have interval-censored data on the distribution of family income
within various school districts. The data for one district are below
my signature. bin_min and bin_max are the endpoints of the interval
(the top interval has only one endpoint), and fb is the number of
families in the interval. I would like to estimate the distribution of
income and derived quantities, most importantly the mean.

Two questions, if I may:
1. Will intcens provide me the mean income, or will I need to
calculate it from the parameters of the distribution?
2. It looks to me as though intcens can handle weights through a
command like this: "intcens bin_min bin_max fb". But if this is the
syntax, how can intcens tell that fb is a weight and not a regressor?

Finally: Is there a different command that I should be using for this
purpose?

Best wishes,
Paul von Hippel

 fb   bin_min   bin_max
 21         0     10000
 22     10000     14999
 29     15000     19999
105     20000     24999
 80     25000     29999
155     30000     34999
159     35000     39999
 68     40000     44999
138     45000     49999
210     50000     59999
264     60000     74999
324     75000     99999
123    100000    124999
129    125000    149999
 75    150000    199999
110    200000         .

================================

You could use -intcens- but the distributions that it fits are not ones
that are commonly used to describe income distributions. I suggest that
you instead look at something like -gbgfit- (by Austin Nichols on SSC),
as this handles interval-censored (grouped) data. You'd have to recode
your bin boundary variables slightly to make the program work (see the
help file).

If it doesn't take weights then you should be able to fudge this
pre-estimation by multiplying your frequency variable ("fb"?) by your
weights. [Of course this is frequency weighting; not accounting for
design weights etc.]

Once you have the parameter estimates, you'd be able to calculate a
number of distributional summary statistics from the saved results. (See
also my -gb2fit- on SSC.)

Stephen
------------------
Professor Stephen P. Jenkins <s.jenkins@lse.ac.uk>
Department of Social Policy and STICERD
London School of Economics and Political Science
Houghton Street, London WC2A 2AE, UK
Tel: +44(0)20 7955 6527
Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
2011, http://ukcatalogue.oup.com/product/9780199226436.do
Survival Analysis Using Stata:
http://www.iser.essex.ac.uk/survival-analysis
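
A minimal sketch of one way to get a mean from these bins, offered as an
illustration and not as the method either poster used. It assumes income is
lognormal (so log income is normal) and swaps in official Stata's -intreg-
in place of -intcens- or -gbgfit-. The variables lnlo and lnhi are
hypothetical names; fb, bin_min and bin_max are as in the data above. In
standard Stata syntax a weight goes in square brackets, which is how a
command tells a weight from a regressor.

    * log bin endpoints; ln(0) and ln(.) are missing, so -intreg- treats the
    * bottom bin as left-censored and the open top bin as right-censored
    generate double lnlo = ln(bin_min)
    generate double lnhi = ln(bin_max)

    * constant-only interval regression, each bin counted fb times
    intreg lnlo lnhi [fweight = fb]

    * lognormal median and mean implied by the fitted mu and sigma
    * (assumption: the lognormal shape is adequate for these data)
    display "median income: " exp(_b[_cons])
    display "mean income:   " exp(_b[_cons] + e(sigma)^2/2)

The one-unit gaps between bins (14999 to 15000, and so on) are ignored here,
and a lognormal may fit the upper tail poorly, so the GB2-type fits suggested
above may give a better mean.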