Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: Mean of interval-censored data

From	<[email protected]>
To	<[email protected]>
Subject	st: Mean of interval-censored data
Date	Mon, 28 Nov 2011 09:11:37 -0000

------------------------------

Date: Sun, 27 Nov 2011 11:28:58 -0600
From: Paul von Hippel <[email protected]>
Subject: st: Mean of interval-censored data

I am interested in using the intcens command for Stata to model an
income distribution. I wonder if I can ask for your advice on how the
program handles weights, and on whether it can output expected values.
Many thanks for your time.

I have interval-censored data on the distribution of family income
within various school districts. The data for one district are below
my signature. bin_min and bin_max are the endpoints of the interval
(the top interval has only one endpoint), and fb is the number of
families in the interval. I would like to estimate the distribution of
income and derived quantities, most importantly the mean.

Two questions, if I may:
1. Will intcens provide me the mean income, or will I need to
calculate it from the parameters of the distribution?
2. It looks to me as though intcens can handle weights through a
command like this: "intcens bin_min bin_max fb". But if this is the
syntax, how can intcens tell that fb is a weight and not a regressor?

Finally: Is there a different command that I should be using for this
purpose?
Thanks for any advice.

Best wishes,
Paul von Hippel

   fb    bin_min    bin_max
   21          0      10000
   22      10000      14999
   29      15000      19999
  105      20000      24999
   80      25000      29999
  155      30000      34999
  159      35000      39999
   68      40000      44999
  138      45000      49999
  210      50000      59999
  264      60000      74999
  324      75000      99999
  123     100000     124999
  129     125000     149999
   75     150000     199999
  110     200000          .

================================

You could use -intcens- but the distributions that it fits are not ones
that are commonly used to describe income distributions. I suggest that
you instead look at something like -gbgfit- (by Austin Nichols on SSC),
as this handles interval-censored (grouped) data. You'd have to recode
your bin boundary variables slightly to make the program work (see the
help file).
If -gbgfit- doesn't take weights, then you should be able to work around
this before estimation by multiplying your frequency variable ("fb"?) by
your weights. [Of course this is frequency weighting; it does not account
for design weights etc.]
Once you have the parameter estimates, you'd be able to calculate a
number of distributional summary statistics from the saved results. (See
also my -gb2fit- on SSC.)
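
As a minimal sketch of the two steps above (frequency weights, then a mean
derived from the fitted parameters), the following assumes a lognormal
distribution fitted with official -intreg-. This is a swapped-in stand-in,
not -intcens- or -gbgfit-. The lognormal choice, the variable names ln_lo
and ln_hi, and the one-unit shift that closes the gaps between bins are
all assumptions made for illustration.

```stata
* Sketch only: lognormal fit to the binned counts with official -intreg-.
* Assumption: lognormal income; this is not -intcens- or -gbgfit-.
clear
input fb bin_min bin_max
 21      0  10000
 22  10000  14999
 29  15000  19999
105  20000  24999
 80  25000  29999
155  30000  34999
159  35000  39999
 68  40000  44999
138  45000  49999
210  50000  59999
264  60000  74999
324  75000  99999
123 100000 124999
129 125000 149999
 75 150000 199999
110 200000      .
end

* Assumption: income is continuous, so make the bins contiguous
* (14999 -> 15000 etc.). The first bin and the open top bin are left alone.
replace bin_max = bin_max + 1 if bin_max < . & bin_min > 0

* Work on the log scale. ln(0) and ln(.) are missing, which -intreg-
* reads as left-censored and right-censored observations respectively.
generate double ln_lo = ln(bin_min)
generate double ln_hi = ln(bin_max)

* fb sits inside the weight brackets, so it is a frequency weight.
* Anything listed after ln_hi outside the brackets would be a regressor.
intreg ln_lo ln_hi [fweight = fb]

* Mean of a lognormal: exp(mu + sigma^2/2), from the saved results.
display "Estimated mean income: " %12.0fc exp(_b[_cons] + 0.5 * e(sigma)^2)
```

In Stata syntax generally, variables listed after the two dependent
variables are regressors, while a weight has to appear inside square
brackets. Whether -intcens- or -gbgfit- accept weights should be checked
in their help files.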

Stephen
------------------
Professor Stephen P. Jenkins <[email protected]>
Department of Social Policy and STICERD
London School of Economics and Political Science
Houghton Street, London WC2A 2AE, UK
Tel: +44(0)20 7955 6527
Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP
2011, http://ukcatalogue.oup.com/product/9780199226436.do
Survival Analysis Using Stata:
http://www.iser.essex.ac.uk/survival-analysis
Downloadable papers and software: http://ideas.repec.org/e/pje7.html
