Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Vladimír Hlásny <vhlasny@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: GMM minimization of regional errors imputed from hhd level model
Date: Sat, 29 Jun 2013 11:08:47 +0900

Thanks for responding, Austin. The full reference is: Korinek, Mistiaen and Ravallion (2007), An econometric method of correcting for unit nonresponse bias in surveys, J. of Econometrics 136.

My sample includes 12000 responding households. I know their income, and which of 2500 regions they come from. In addition, for each region, I know the number of non-responding households.

I find the coefficient on income by fitting estimated regional population to actual population:

  P_i = logit f(income_i, theta)
  actual_j = responding_j + nonresponding_j
  theta = argmin { sum_{i in j}(1/P_i) - actual_j }

Response probability may not be monotonic in income. The logit may be a non-monotonic function of income.

Thanks for any thoughts on how to estimate this in Stata, or how to make my 'trick' (setting 12000-2500 hhd-level residuals manually to zero) work better.

Vladimir

On Sat, Jun 29, 2013 at 1:49 AM, Austin Nichols <austinnichols@gmail.com> wrote:
> Vladimír Hlásny <vhlasny@gmail.com>:
> As the FAQ hints, if you don't provide full references, don't expect
> good answers.
>
> I don't understand your description--how are you running a logit of
> response on income, when you only have income for responders? Can you
> give a sense of what the data looks like?
>
> On another topic, why would anyone expect response probability to be
> monotonic in income?
>
> On Fri, Jun 28, 2013 at 10:05 AM, Vladimír Hlásny <vhlasny@gmail.com> wrote:
>> Hi,
>> I am using a method by Korinek, Mistiaen and Ravallion (2007) to
>> correct for unit-nonresponse bias. That involves estimating
>> response-probability for each household, inferring regional
>> population from these probabilities, and fitting against actual
>> regional populations. I must use household-level data and region-level
>> data simultaneously, because coefficients in the household-level model
>> are adjusted based on fit of the regional-level populations.
>>
>> I used a trick - manually resetting residuals of all but
>> one-per-region household - but this trick doesn't produce perfect
>> results. Please find the details, remaining problems, as well as the
>> Stata code described below. Any thoughts on this?
>>
>> Thank you for any suggestions!
>>
>> Vladimir Hlasny
>> Ewha Womans University
>> Seoul, Korea
>>
>> Details:
>> I am estimating households' probability to respond to a survey as a
>> function of their income. For each responding household (12000), I
>> have data on income. Also, at the level of region (3000), I know the
>> number of responding and non-responding households.
>>
>> I declare a logit equation of response-probability as a function of
>> income, to estimate it for all responding households.
>>
>> The identification is provided by fitting of population in each
>> region. For each responding household, I estimate their true mass as
>> the inverse of their response probability. Then I sum the
>> response-probabilities for all households in a region, and fit it
>> against the true population.
>>
>> Stata problem:
>> I am estimating GMM at the regional level. But, to obtain the
>> population estimate in each region, I calculate response-probabilities
>> at the household level and sum them up in a region. This region-level
>> fitting and response-probability estimation occurs
>> simultaneously/iteratively -- as logit-coefficients are adjusted to
>> minimize region-level residuals, households response-probabilities
>> change.
>>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
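
For readers following the thread, here is a minimal sketch of the region-level fit described above. It is written in Python, not Stata, and it is not the poster's code or the estimator from the cited paper. Several things are assumptions made only for illustration: the response index is linear in income (theta0 + theta1*income), the criterion is an unweighted sum of squared regional residuals, a coarse grid search stands in for a proper GMM optimizer, and the data are synthetic. All function and variable names are hypothetical.

```python
import math

def response_prob(income, theta):
    """Logistic response probability; the linear index is an assumption."""
    z = theta[0] + theta[1] * income
    return 1.0 / (1.0 + math.exp(-z))

def estimated_population(incomes, theta):
    """Each responding household counts for 1/P_i households in its region."""
    return sum(1.0 / response_prob(x, theta) for x in incomes)

def objective(theta, regions, actual):
    """Unweighted sum of squared region-level residuals (assumed criterion)."""
    return sum(
        (estimated_population(incomes, theta) - actual[j]) ** 2
        for j, incomes in regions.items()
    )

def grid_search(regions, actual, grid0, grid1):
    """Return the grid point (theta0, theta1) with the smallest objective."""
    candidates = ((t0, t1) for t0 in grid0 for t1 in grid1)
    return min(candidates, key=lambda th: objective(th, regions, actual))

# Synthetic data: incomes of responding households, grouped by region.
regions = {
    "A": [0.5, 1.0, 1.5],
    "B": [1.0, 2.0, 3.0, 4.0],
    "C": [0.2, 0.4],
    "D": [2.5, 3.5, 5.0],
}

# "Actual" regional populations are generated from a known theta so the
# sketch can check that the search recovers it.
true_theta = (1.0, -0.5)
actual = {j: estimated_population(inc, true_theta) for j, inc in regions.items()}

grid0 = [-1.0 + 0.25 * k for k in range(17)]  # theta0 from -1 to 3
grid1 = [-2.0 + 0.25 * k for k in range(13)]  # theta1 from -2 to 1

theta_hat = grid_search(regions, actual, grid0, grid1)
print("theta_hat:", theta_hat)
```

In this sketch the residuals exist only at the region level: household-level probabilities are recomputed and summed inside the objective each time theta changes, so there are no household-level residuals to reset to zero.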