

From: Vladimír Hlásny <vhlasny@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: GMM minimization of regional errors imputed from hhd level model
Date: Sun, 30 Jun 2013 11:10:39 +0900

Dear Austin:

The model is definitely identified. Matlab runs the model well, because I can use household-level and region-level variables simultaneously. My trick in Stata also works, except that it produces imprecise results and occasionally fails to converge. (My current trick is to make Stata think that the model is at the household level, and to manually set all but one household-level residual per region to zero.)

Incomes of the responding households are my instrument. Essentially, because each region has a different survey response rate and a different distribution of incomes among responding households, GMM estimates the relationship between households' response probability and their income (subject to assumptions on the representativeness of responding households).

In sum: I need Stata to use region-level and household-level variables (or matrices) simultaneously. Specifically, Stata must minimize region-level residuals computed from a household-level logistic equation. E.g., if I feed household-level data into the GMM function-evaluator program, can I instruct the GMM to use only one residual per region?

Vladimir

On Sat, Jun 29, 2013 at 10:27 PM, Austin Nichols <austinnichols@gmail.com> wrote:
> Vladimír Hlásny <vhlasny@gmail.com>:
> I have not read the ref. But you do not really have instruments. That
> is, you are not setting E(Ze) to zero with e a residual from some
> equation and Z your instrument; you do not have moments of that type.
> Seems you should start with optimize() instead of -gmm-, as you are
> just minimizing the sum of squared deviations from targets at the
> region level. Or am I still misunderstanding this exercise?
>
> On Fri, Jun 28, 2013 at 10:08 PM, Vladimír Hlásny <vhlasny@gmail.com> wrote:
>> Thanks for responding, Austin.
>>
>> The full reference is: Korinek, Mistiaen and Ravallion (2007), An
>> econometric method of correcting for unit nonresponse bias in surveys,
>> J. of Econometrics 136.
>>
>> My sample includes 12000 responding households. I know their income,
>> and which of 2500 regions they come from. In addition, for each
>> region, I know the number of non-responding households. I find the
>> coefficient on income by fitting estimated regional population to
>> actual population:
>>
>> P_i = logit f(income_i,theta)
>> actual_j = responding_j + nonresponding_j
>> theta = argmin {sum(1/P_i) - actual_j}
>>
>> Response probability may not be monotonic in income. The logit may be
>> a non-monotonic function of income.
>>
>> Thanks for any thoughts on how to estimate this in Stata, or how to
>> make my 'trick' (setting 12000-2500 hhd-level residuals manually to
>> zero) work better.
>>
>> Vladimir
>>
>> On Sat, Jun 29, 2013 at 1:49 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> Vladimír Hlásny <vhlasny@gmail.com>:
>>> As the FAQ hints, if you don't provide full references, don't expect
>>> good answers.
>>>
>>> I don't understand your description--how are you running a logit of
>>> response on income, when you only have income for responders? Can you
>>> give a sense of what the data looks like?
>>>
>>> On another topic, why would anyone expect response probability to be
>>> monotonic in income?
>>>
>>> On Fri, Jun 28, 2013 at 10:05 AM, Vladimír Hlásny <vhlasny@gmail.com> wrote:
>>>> Hi,
>>>> I am using a method by Korinek, Mistiaen and Ravallion (2007) to
>>>> correct for unit-nonresponse bias. That involves estimating
>>>> response-probability for each household, inferring regional
>>>> population from these probabilities, and fitting against actual
>>>> regional populations. I must use household-level data and region-level
>>>> data simultaneously, because coefficients in the household-level model
>>>> are adjusted based on fit of the regional-level populations.
>>>>
>>>> I used a trick - manually resetting residuals of all but
>>>> one-per-region household - but this trick doesn't produce perfect
>>>> results. Please find the details, remaining problems, as well as the
>>>> Stata code described below. Any thoughts on this?
>>>>
>>>> Thank you for any suggestions!
>>>>
>>>> Vladimir Hlasny
>>>> Ewha Womans University
>>>> Seoul, Korea
>>>>
>>>> Details:
>>>> I am estimating households' probability to respond to a survey as a
>>>> function of their income. For each responding household (12000), I
>>>> have data on income. Also, at the level of region (3000), I know the
>>>> number of responding and non-responding households.
>>>>
>>>> I declare a logit equation of response-probability as a function of
>>>> income, to estimate it for all responding households.
>>>>
>>>> The identification is provided by fitting of population in each
>>>> region. For each responding household, I estimate their true mass as
>>>> the inverse of their response probability. Then I sum the
>>>> response-probabilities for all households in a region, and fit it
>>>> against the true population.
>>>>
>>>> Stata problem:
>>>> I am estimating GMM at the regional level. But, to obtain the
>>>> population estimate in each region, I calculate response-probabilities
>>>> at the household level and sum them up in a region. This region-level
>>>> fitting and response-probability estimation occurs
>>>> simultaneously/iteratively -- as logit-coefficients are adjusted to
>>>> minimize region-level residuals, households response-probabilities
>>>> change.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
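For readers following the thread, the region-level fit described above can be sketched outside Stata. The block below is only an illustration under stated assumptions, not the Korinek, Mistiaen and Ravallion estimator and not code from this thread: it assumes a linear logit index a + b*income, uses the unweighted sum of squared regional deviations that Austin describes, and runs on tiny synthetic data whose "actual" populations are generated from known coefficients. All function and variable names are hypothetical. Households stay grouped by region, so there is exactly one residual per region and nothing has to be set to zero by hand.

```python
import math

def estimated_population(theta, incomes_by_region):
    """Sum of inverse response probabilities over responders, per region."""
    a, b = theta
    # For a logistic P_i, 1/P_i = 1 + exp(-(a + b*x_i)).
    return [sum(1.0 + math.exp(-(a + b * x)) for x in incomes)
            for incomes in incomes_by_region]

def residuals(theta, incomes_by_region, actual_pop):
    """One residual per region: estimated minus actual population."""
    est = estimated_population(theta, incomes_by_region)
    return [e - n for e, n in zip(est, actual_pop)]

def loss(theta, incomes_by_region, actual_pop):
    """Unweighted sum of squared regional residuals (an assumption)."""
    return sum(r * r for r in residuals(theta, incomes_by_region, actual_pop))

def fit(incomes_by_region, actual_pop, theta0=(0.0, 0.0), max_iter=100, tol=1e-12):
    """Gauss-Newton with step halving for the two coefficients (a, b)."""
    a, b = theta0
    for _ in range(max_iter):
        r = residuals((a, b), incomes_by_region, actual_pop)
        current = sum(e * e for e in r)
        # Analytic derivatives of each regional residual w.r.t. a and b.
        ja = [-sum(math.exp(-(a + b * x)) for x in inc) for inc in incomes_by_region]
        jb = [-sum(x * math.exp(-(a + b * x)) for x in inc) for inc in incomes_by_region]
        # Normal equations (J'J) d = -J'r, solved by Cramer's rule (2x2).
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * e for u, e in zip(ja, r))
        gb = sum(v * e for v, e in zip(jb, r))
        det = saa * sbb - sab * sab
        if abs(det) < 1e-300:
            break  # coefficients not separately identified at this point
        da = (-ga * sbb + gb * sab) / det
        db = (-gb * saa + ga * sab) / det
        # Halve the step until the loss actually decreases.
        step = 1.0
        while step > 1e-10:
            cand = (a + step * da, b + step * db)
            if loss(cand, incomes_by_region, actual_pop) < current:
                break
            step /= 2.0
        else:
            break  # no improving step found; stop at the current point
        a, b = cand
        if abs(step * da) + abs(step * db) < tol:
            break
    return a, b

# Synthetic example: three regions with different income distributions
# (incomes in arbitrary log units). "Actual" populations are generated
# from known coefficients, so an exact fit exists by construction.
incomes_by_region = [[0.0, 0.0, 1.0], [1.0, 1.0, 2.0], [0.0, 2.0, 2.0]]
true_theta = (1.0, -0.5)
actual_pop = estimated_population(true_theta, incomes_by_region)

a_hat, b_hat = fit(incomes_by_region, actual_pop)
print(f"a = {a_hat:.6f}, b = {b_hat:.6f}")
```

The point of the sketch is the data layout: the objective loops over regions and sums over the households inside each one, which is the structure Austin's optimize() suggestion points toward. With real data the fit will not be exact, and weighting of the regional residuals would need to follow the cited paper.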

**Follow-Ups**:
- **Re: st: GMM minimization of regional errors imputed from hhd level model** *From:* Austin Nichols <austinnichols@gmail.com>

**References**:
- **st: GMM minimization of regional errors imputed from hhd level model** *From:* Vladimír Hlásny <vhlasny@gmail.com>
- **Re: st: GMM minimization of regional errors imputed from hhd level model** *From:* Austin Nichols <austinnichols@gmail.com>
- **Re: st: GMM minimization of regional errors imputed from hhd level model** *From:* Vladimír Hlásny <vhlasny@gmail.com>
- **Re: st: GMM minimization of regional errors imputed from hhd level model** *From:* Austin Nichols <austinnichols@gmail.com>
