------------------------------

Date: Sun, 27 Nov 2011 11:28:58 -0600
From: Paul von Hippel <paulvonhippel.utaustin@gmail.com>
Subject: st: Mean of interval-censored data

I am interested in using the intcens command for Stata to model an income distribution. I wonder if I can ask for your advice on how the program handles weights, and on whether it can output expected values. Many thanks for your time.

I have interval-censored data on the distribution of family income within various school districts. The data for one district are below my signature. bin_min and bin_max are the endpoints of the interval (the top interval has only one endpoint), and fb is the number of families in the interval.

I would like to estimate the distribution of income and derived quantities, most importantly the mean. Two questions, if I may:

1. Will intcens provide me the mean income, or will I need to calculate it from the parameters of the distribution?

2. It looks to me as though intcens can handle weights through a command like this: "intcens bin_min bin_max fb". But if this is the syntax, how can intcens tell that fb is a weight and not a regressor?

Finally: Is there a different command that I should be using for this purpose?

Thanks for any advice.

Best wishes,
Paul von Hippel

 fb   bin_min   bin_max
 21         0     10000
 22     10000     14999
 29     15000     19999
105     20000     24999
 80     25000     29999
155     30000     34999
159     35000     39999
 68     40000     44999
138     45000     49999
210     50000     59999
264     60000     74999
324     75000     99999
123    100000    124999
129    125000    149999
 75    150000    199999
110    200000         .

================================

You could use -intcens- but the distributions that it fits are not ones that are commonly used to describe income distributions. I suggest that you instead look at something like -gbgfit- (by Austin Nichols on SSC), as this handles interval-censored (grouped) data. You'd have to recode your bin boundary variables slightly to make the program work (see the help file).
If it doesn't take weights then you should be able to fudge this pre-estimation by multiplying your frequency variable ("fb"?) by your weights. [Of course this is frequency weighting; not accounting for design weights etc.] Once you have the parameter estimates, you'd be able to calculate a number of distributional summary statistics from the saved results. (See also my -gb2fit- on SSC.)

Stephen
------------------
Professor Stephen P. Jenkins <s.jenkins@lse.ac.uk>
Department of Social Policy and STICERD
London School of Economics and Political Science
Houghton Street, London WC2A 2AE, UK
Tel: +44(0)20 7955 6527
Changing Fortunes: Income Mobility and Poverty Dynamics in Britain, OUP 2011, http://ukcatalogue.oup.com/product/9780199226436.do
Survival Analysis Using Stata: http://www.iser.essex.ac.uk/survival-analysis
Downloadable papers and software: http://ideas.repec.org/e/pje7.html
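
To make the two ideas in the reply concrete (frequency weighting of grouped data, and computing the mean from fitted parameters), here is a minimal sketch in Python rather than Stata. It is only an illustration under stated assumptions: it fits a lognormal, which is chosen for simplicity and is not claimed to be what -intcens-, -gbgfit-, or -gb2fit- fit; it treats the posted bins as contiguous by using each bin's lower bound as the previous bin's upper bound; and it uses a coarse grid search instead of a proper optimizer. The helper names (lognormal_cdf, grouped_loglik, fit_by_grid, lognormal_mean) are hypothetical.

```python
import math

# Bin edges and family counts from the posted district. ASSUMPTION: the posted
# upper bounds (14999, 19999, ...) are replaced by the next bin's lower bound
# so the intervals are contiguous; None marks the open-ended top bin.
EDGES = [0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
         50000, 60000, 75000, 100000, 125000, 150000, 200000, None]
COUNTS = [21, 22, 29, 105, 80, 155, 159, 68, 138,
          210, 264, 324, 123, 129, 75, 110]


def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal(mu, sigma) at x; None stands for +infinity."""
    if x is None:
        return 1.0
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def grouped_loglik(mu, sigma, edges=EDGES, counts=COUNTS):
    """Frequency-weighted log-likelihood: sum of count * log P(bin)."""
    total = 0.0
    for lower, upper, n in zip(edges[:-1], edges[1:], counts):
        p = lognormal_cdf(upper, mu, sigma) - lognormal_cdf(lower, mu, sigma)
        if p <= 0.0:
            return float("-inf")  # parameters give this bin zero probability
        total += n * math.log(p)
    return total


def fit_by_grid(step=0.02):
    """Coarse grid search over (mu, sigma); ASSUMED search ranges."""
    best = (float("-inf"), None, None)
    for i in range(int(round(2.0 / step)) + 1):       # mu in [10, 12]
        mu = 10.0 + i * step
        for j in range(int(round(1.2 / step)) + 1):   # sigma in [0.3, 1.5]
            sigma = 0.3 + j * step
            ll = grouped_loglik(mu, sigma)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best


def lognormal_mean(mu, sigma):
    """Mean implied by the parameters: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + 0.5 * sigma ** 2)


loglik, mu_hat, sigma_hat = fit_by_grid()
mean_hat = lognormal_mean(mu_hat, sigma_hat)
print(f"mu={mu_hat:.2f} sigma={sigma_hat:.2f} mean={mean_hat:,.0f}")
```

The counts enter only as multipliers on each bin's log-probability, which is what frequency weighting means here. The implied mean depends heavily on the distribution assumed for the open-ended top bin, so a different family could give a noticeably different answer.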

