Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: problem with generated regressands and WLS

From: Stas Kolenikov <[email protected]>
To: [email protected]
Subject: Re: st: problem with generated regressands and WLS
Date: Wed, 13 Oct 2010 21:46:01 -0500

```That's more or less the mainstream hierarchical linear model
extensively used in education. You might want to check, say, the
Raudenbush & Bryk (2002) SAGE book. The methods described there do take into
account the sampling variability for both individual and
district-level estimates properly, and I already mentioned Stata
commands that implement these methods.

2010/10/13 Arka Roy Chaudhuri <[email protected]>:
> Thanks so much for the reply. I am not sure this is a hierarchical
> model. Actually, what I am trying to analyze is the effect of trade
> reforms on the gender wage gap in a district over time. So my first
> stage regression is at the level of the individual: y is log wages,
> the z's are individual-level factors like education and marital status, and
> the x's are interactions between the male dummy and district dummies. I
> also have district dummies and a male dummy in the first stage. I take
> the betas (the coefficients on the interactions between district
> dummies and the male dummy) as estimates of the residual (i.e. the gap
> after correcting for factors like education) gender wage gap within a
> district. In the next step I regress this district-level gender wage
> gap (beta) on the district-level tariffs (the variable q in the
> equation) and estimate delta, which is my main parameter of interest. I
> could have done the whole analysis at the individual level, but I would
> like to get an estimate of the gender wage gap, i.e. the betas. Thus my
> question is: since my dependent variable in the second stage is an
> estimate and not taken directly from the data, how should I correct for
> this fact? Thanks
>
> Arka
>
> 2010/10/13 Stas Kolenikov <[email protected]>:
>> Is this a multilevel model with interactions between levels? If yes,
>> you'd want to estimate it as such, probably in -gllamm- or -xtmixed-.
>> If not, you can still run this in the reduced form with all the
>> interactions spelled out as a regular regression, although you'd
>> probably want to correct for heteroskedasticity and/or clustering.
>>
>> 2010/10/13 Arka Roy Chaudhuri <[email protected]>:
>>> I shall repost my earlier mail (using full names for the Greek
>>> characters) as I just learnt that many might not be able to see the
>>> Greek characters. I am extremely sorry for my mistake and the
>>> inconvenience caused.
>>> I wrote:
>>>  Thanks for the response. Sorry for not making my notation clearer: I
>>> had used x for the independent variables in both the first and second
>>> stage. Revising my notation:
>>>
>>>  1st stage:
>>>  y = alpha + beta1*x1 + beta2*x2 + ... + betan*xn + rho1*z1 + rho2*z2 + u
>>>
>>>  2nd stage:
>>>  beta = p + delta*q + error
>>>
>>>  In the first stage y is the dependent variable and x1...xn, z1, z2 are
>>>  the independent variables; beta1...betan and rho1, rho2 are the parameters.
>>>  alpha and p are the intercepts in the first and second stage respectively.
>>>  The betas (beta1...betan) from the first stage constitute my dependent
>>>  variable in the second stage; since there are n of them, I have n
>>>  observations for my dependent variable in the second stage. q is the
>>>  independent variable in the second stage and delta is the parameter
>>>  to be estimated. I also have n observations of q.
>>>  Yes, I do want to improve efficiency, although I am not sure how.
>>>  Should I use the entire variance-covariance matrix of the betas from the
>>>  first stage as the weighting matrix in the second stage? Or should I
>>>  just use the variances (from the first stage) of the betas as analytic
>>>  weights in the second stage? If I use the second method, should not
>>>  non-zero covariances across the observations (betas) affect my
>>>  results? Also, if I am to use the entire variance-covariance matrix as
>>>  the weighting matrix, how should I implement it in Stata? Please
>>>
>>> Arka
>>>
>>> 2010/10/12 Arka Roy Chaudhuri <[email protected]>:
>>>> Thanks for the response. Sorry for not making my notation clearer: I
>>>> had used x for the independent variables in both the first and second
>>>> stage. Revising my notation:
>>>> 1st stage:
>>>> y = α + β1x1 + β2x2 + ... + βnxn + ρ1z1 + ρ2z2 + u
>>>>
>>>> 2nd stage:
>>>> β = p + δq + ε
>>>>
>>>> In the first stage y is the dependent variable and x1...xn, z1, z2 are
>>>> the independent variables. α and p are the intercepts in the first and
>>>> second stage respectively.
>>>> The β's (β1, β2, ..., βn) from the first stage constitute my dependent
>>>> variable in the second stage; since there are n of them, I have n
>>>> observations for my dependent variable in the second stage. q is the
>>>> independent variable in the second stage. I also have n observations
>>>> of q.
>>>> Yes, I do want to improve efficiency, although I am not sure how.
>>>> Should I use the entire variance-covariance matrix of the β's from the
>>>> first stage as the weighting matrix in the second stage? Or should I
>>>> just use the variances (from the first stage) of the β's as analytic
>>>> weights in the second stage? If I use the second method, shouldn't
>>>> non-zero covariances across the observations (β's) affect my
>>>> results? Also, if I am to use the entire variance-covariance matrix as
>>>> the weighting matrix, how should I implement it in Stata? Please
>>>>
>>>> Arka
>>>>
>>>> 2010/10/12 Austin Nichols <[email protected]>:
>>>>> Arka Roy Chaudhuri <[email protected]>:
>>>>> If you think beta is measured with an independent error, i.e. no
>>>>> endogeneity or other endemic problems, you can ignore the fact that it
>>>>> is generated; measurement error in the depvar is usually not a
>>>>> problem. But perhaps you are looking for improved efficiency, and you
>>>>> want to use the squared SE on beta as a measure of the error
>>>>> variance--but it does not vary by observation--see the manual entry on
>>>>> -vwls- for example.  Is your "second stage" in matrix form using the
>>>>> same y and x and so forth, or have you reused notation?
>>>>>
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>
>>
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
>> Small print: I use this email account for mailing lists only.
>>
>

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

```
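
The weighting question raised in the thread (using the first-stage variances of the betas as weights in the second stage) can be sketched in a few lines. This is only an illustration written in Python, not Stata, and not code from any of the posters: the function name `wls_fit` and all numbers are made up. It assumes the sampling variances of the estimated betas are known and ignores the covariances between them, which is exactly the simplification the original question worries about. The reported standard error treats those variances as the true error variances, which is the variance-weighted least squares idea behind -vwls-.

```python
# Sketch: second-stage regression of estimated betas on q, weighted by the
# inverse of each beta's first-stage sampling variance.
# Hypothetical names and data; covariances across betas are ignored.

def wls_fit(beta_hat, q, var_beta):
    """Fit beta_hat = p + delta*q with weights 1/var_beta.

    Returns (p, delta, se_delta). se_delta treats var_beta as the known
    error variances, so it is not rescaled by a residual variance.
    """
    w = [1.0 / v for v in var_beta]
    sw = sum(w)
    # Weighted means of the regressor and the dependent variable
    q_bar = sum(wi * qi for wi, qi in zip(w, q)) / sw
    b_bar = sum(wi * bi for wi, bi in zip(w, beta_hat)) / sw
    # Weighted sums of squares and cross-products around the weighted means
    sxx = sum(wi * (qi - q_bar) ** 2 for wi, qi in zip(w, q))
    sxy = sum(wi * (qi - q_bar) * (bi - b_bar)
              for wi, qi, bi in zip(w, q, beta_hat))
    delta = sxy / sxx
    p = b_bar - delta * q_bar
    se_delta = (1.0 / sxx) ** 0.5
    return p, delta, se_delta


# Illustrative data: four districts, gap estimates lying exactly on a line
q = [0.0, 1.0, 2.0, 3.0]                 # district tariffs (made up)
beta_hat = [0.5, 0.4, 0.3, 0.2]          # estimated gaps (made up)
var_beta = [0.01, 0.04, 0.01, 0.04]      # squared first-stage SEs (made up)

p, delta, se_delta = wls_fit(beta_hat, q, var_beta)
print(p, delta, se_delta)
```

Using the full first-stage variance-covariance matrix V instead of only its diagonal would turn this into generalized least squares, with the estimator (X'V^-1 X)^-1 X'V^-1 beta_hat; the sketch above is the special case where V is diagonal.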