Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: GMM minimization of regional errors imputed from hhd level model

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: GMM minimization of regional errors imputed from hhd level model
Date: Sat, 29 Jun 2013 09:27:40 -0400

```Vladimír Hlásny <vhlasny@gmail.com>:
I have not read the ref. But you do not really have instruments. That
is, you are not setting E(Ze) to zero with e a residual from some
equation and Z your instrument; you do not have moments of that type.
You are just minimizing the sum of squared deviations from targets at
the region level. Or am I still misunderstanding this exercise?

On Fri, Jun 28, 2013 at 10:08 PM, Vladimír Hlásny <vhlasny@gmail.com> wrote:
> Thanks for responding, Austin.
>
> The full reference is: Korinek, Mistiaen and Ravallion (2007), An
> econometric method of correcting for unit nonresponse bias in surveys,
> J. of Econometrics 136.
>
> My sample includes 12000 responding households. I know their income,
> and which of 2500 regions they come from. In addition, for each
> region, I know the number of non-responding households. I find the
> coefficient on income by fitting estimated regional population to
> actual population:
>
> P_i = logit f(income_i,theta)
> actual_j = responding_j + nonresponding_j
> theta = argmin {sum(1/P_i) - actual_j}
>
> Response probability may not be monotonic in income. The logit may be
> a non-monotonic function of income.
>
> Thanks for any thoughts on how to estimate this in Stata, or how to
> make my 'trick' (setting 12000-2500 hhd-level residuals manually to
> zero) work better.
>
>
> On Sat, Jun 29, 2013 at 1:49 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>> As the FAQ hints, if you don't provide full references, don't expect
>>
>> I don't understand your description--how are you running a logit of
>> response on income, when you only have income for responders?  Can you
>> give a sense of what the data looks like?
>>
>> On another topic, why would anyone expect response probability to be
>> monotonic in income?
>>
>> On Fri, Jun 28, 2013 at 10:05 AM, Vladimír Hlásny <vhlasny@gmail.com> wrote:
>>> Hi,
>>> I am using a method by Korinek, Mistiaen and Ravallion (2007) to
>>> correct for unit-nonresponse bias. That involves estimating
>>> response-probability for each household,  inferring regional
>>> population from these probabilities, and fitting against actual
>>> regional populations. I must use household-level data and region-level
>>> data simultaneously, because coefficients in the household-level model
>>> are adjusted based on fit of the regional-level populations.
>>>
>>> I used a trick - manually resetting the residuals of all but one
>>> household per region - but this trick doesn't produce perfect
>>> results. Please find the details, remaining problems, as well as the
>>> Stata code described below. Any thoughts on this?
>>>
>>> Thank you for any suggestions!
>>>
>>> Ewha Womans University
>>> Seoul, Korea
>>>
>>> Details:
>>> I am estimating households' probability of responding to a survey as a
>>> function of their income. For each responding household (12000), I
>>> have data on income. Also, at the level of region (3000), I know the
>>> number of responding and non-responding households.
>>>
>>> I declare a logit equation of response-probability as a function of
>>> income, to estimate it for all responding households.
>>>
>>> The identification is provided by fitting of population in each
>>> region. For each responding household, I estimate their true mass as
>>> the inverse of their response probability. Then I sum these inverse
>>> response-probabilities for all households in a region, and fit the
>>> sum against the true population.
>>>
>>> Stata problem:
>>> I am estimating GMM at the regional level. But, to obtain the
>>> population estimate in each region, I calculate response-probabilities
>>> at the household level and sum their inverses in a region. This region-level
>>> fitting and response-probability estimation occurs
>>> simultaneously/iteratively -- as logit-coefficients are adjusted to
>>> minimize region-level residuals, households' response-probabilities
>>> change.

```
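The objective discussed in the thread can be written at the region level directly, so no household-level residuals need to be zeroed out by hand. The following is a minimal sketch in Python rather than Stata, using only the standard library. It assumes a two-parameter logit that is linear in income and an unweighted sum of squared regional deviations, which is the reading given in the reply at the top of the thread; the estimator in the cited paper may weight regions differently. All function names, the toy data, and the crude grid search are hypothetical and for illustration only.

```python
import math

def response_prob(income, theta):
    """Logistic response probability. theta = (intercept, slope) is an
    assumed parameterization, not taken from the thread."""
    a, b = theta
    return 1.0 / (1.0 + math.exp(-(a + b * income)))

def region_residuals(households, actual, theta):
    """households: list of (region, income) pairs for responders only.
    actual: dict mapping region -> total households (responding + not).
    Returns region -> (sum of 1/P_i over responders) - actual_j."""
    est = {r: 0.0 for r in actual}
    for region, income in households:
        est[region] += 1.0 / response_prob(income, theta)
    return {r: est[r] - actual[r] for r in actual}

def objective(households, actual, theta):
    """Unweighted sum of squared regional deviations."""
    return sum(e * e for e in region_residuals(households, actual, theta).values())

def grid_search(households, actual, grid_a, grid_b):
    """Crude minimizer: evaluate the objective on every grid point."""
    return min(((a, b) for a in grid_a for b in grid_b),
               key=lambda th: objective(households, actual, th))

# Synthetic toy data: three regions, two responders each.
households = [("A", 1.0), ("A", 2.0), ("B", 0.5),
              ("B", 3.0), ("C", 1.5), ("C", 2.5)]
true_theta = (1.0, -0.5)

# Build region totals that the true theta reproduces exactly. They are
# non-integer, which is fine for a toy example but not for real counts.
zeros = {"A": 0.0, "B": 0.0, "C": 0.0}
actual = region_residuals(households, zeros, true_theta)

grid = [x / 10 for x in range(-20, 21)]
best = grid_search(households, actual, grid, grid)
print(best, objective(households, actual, best))
```

Because each regional total depends on every responder in that region, the residual exists only per region; the household-level step is just an intermediate calculation inside the objective. In practice a numerical optimizer would replace the grid search, and a non-monotonic response would need extra terms in income (for example a quadratic) inside `response_prob`.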