Re: Re: st: using subgroup regression coefficients in further regressions


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: Re: st: using subgroup regression coefficients in further regressions
Date   Mon, 23 Jul 2012 00:48:07 +0100

I find it difficult to tell what you want just from your code, and
can't follow what you want to do with or without a treatment dummy,
but

replace yhat = _b[_cons] + _b[X1] * X1 + _b[X2] * X2

following a -regress y X1 X2- is unnecessary as that is what

predict yhat

does.  In practice you would need to do something more like this.

qui forvalues i = `r(min)'/`r(max)' {
        reg y X1 X2 if subgroupvar==`i'
        predict work, residual
        replace res = work if subgroupvar==`i'
        drop work
        predict work
        replace yhat = work if subgroupvar==`i'
        drop work
}
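
A minimal sketch of the same loop with the fit restricted to controls, run on simulated data. The data-generating lines, the seed, and the variable diff are assumptions for illustration only; the other names follow the thread. Because -predict- computes fitted values for every observation with nonmissing predictors, not just the estimation sample, treated observations get a y-hat from their own subgroup's control-only regression:

```stata
* Simulated stand-in for the real data (illustration only)
clear
set seed 12345
set obs 300
* three subgroups: 1, 2, 3
gen subgroupvar = ceil(_n / 100)
* every fifth observation is treated
gen treatment_dummy = mod(_n, 5) == 0
gen X1 = rnormal()
gen X2 = rnormal()
gen y = 1 + subgroupvar * X1 - X2 + rnormal()

* 0. initialise the result variable outside the loop
gen yhat = .
summarize subgroupvar, meanonly

quietly forvalues i = `r(min)'/`r(max)' {
        * fit on the controls of subgroup i only
        regress y X1 X2 if subgroupvar == `i' & treatment_dummy != 1
        * fitted values (intercept included) for all observations
        predict work
        * keep them for subgroup i, treated or not
        replace yhat = work if subgroupvar == `i'
        drop work
}

* observed minus predicted
gen diff = y - yhat
```

For control observations diff is the ordinary residual; for treated observations it is the gap between the observed y and the control-based prediction. The same holds for -predict, residual-, so res + yhat reproduces y for treated observations even though they were excluded from the fit. Add -if e(sample)- to -predict- if only in-sample values are wanted.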

On Sun, Jul 22, 2012 at 7:16 PM, Peter Hofmann <maxl@sunrise.ch> wrote:
> Thanks for the systematic outline of how to proceed. I think I am close
> to the solution by including
> - g yhat -    and
> - replace yhat... -    and
> - treatment_dummy != 1  (no treats included in the reg):
>
> g res = .
> g yhat=.
> sum subgroupvar, meanonly
>
> qui forvalues i = `r(min)'/`r(max)' {
>        reg y X1 X2 if subgroupvar==`i' & treatment_dummy != 1  // reg only control group
>        predict work, residual
>        replace res = work if subgroupvar==`i'
>        drop work
>        replace yhat = _b[_cons] + _b[X1] * X1 + _b[X2] * X2  if subgroupvar==`i'
> }
>
> g y = res + yhat   // should give the original dep. var. except for treatments, right?
>
> But in my results it seems like the treats are also included in the
> reg, so perhaps
> & treatment_dummy != 1 did not work....?
>
> The treatments should not be in the same regression as the control group,
> since this is quite small and treatments would influence the outcome.
> Truly, the constant has to be included.
>
> Thank you
> Peter
>
>
>
> On Sun, Jul 22, 2012, Nick Cox <njcoxstata@gmail.com> wrote:
>
> This looks like a different question to me, but the principles are the same.
>
> 0. Initialise a variable to hold results outside the loop
>
> 1. After each regression, you use its estimation results. What you
> want may be most easily calculated in terms of something like
>
> _b[X1] * X1 + _b[X2] * X2
>
> 2. Typically you will -replace- results of the variable created in 0
> for some observations only.
>
> However, I don't understand how this differs from a problem best
> handled by -predict- or why no constant (intercept) appears in your
> expressions.
>
> Nick
>
> On Sun, Jul 22, 2012 at 3:42 PM, Peter Hofmann <maxl@sunrise.ch> wrote:
>> Thank you for the fast reply, Nick. Your hint improves my first step; however,
>> the original question is still unanswered. Obviously I did not pose the
>> question clearly enough, so I will try again:
>>
>> After the regression I want to use the estimated coefficients (betas) of each
>> subgroup (control groups) to calculate the y-hat (= expected dependent
>> variable) of the treatment observations (each corresponding to its specific
>> subgroup) by:
>>
>> treatment1:
>> yhat_1 = beta-hat1 * X1 + beta-hat2 * X2
>>
>> treatment2:
>> yhat_2 = beta-hat3 * X3 + beta-hat4 * X4
>> .....
>>
>> The calculated y-hats of the treatments can now be compared to the real y's
>> of the treatments.
>>
>> Any help is appreciated!
>> Peter
>
>>
>>
>> On Thu, Jul 19, 2012 at 2:00 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>
>>> That code won't work at all. Apart from some fantasy syntax, the
>>> second time around the loop the -generate- would fail as the variable
>>> already exists.
>>
>>> But as you want residuals, you can get them directly:
>>
>>> gen res = .
>>> sum subgroupvar, meanonly
>>
>>> qui forvalues i = `r(min)'/`r(max)' {
>>>        reg y x1 x2 if subgroupvar==`i'
>>>        predict work, residual
>>>        replace res = work if subgroupvar==`i'
>>>        drop work
>>> }
>>
>>> Note, if only as a style point, that putting returned results into
>>> scalars, and then scalars into locals, is in this case two more steps
>>> than needed.
>>
>>
>>
>> On Thu, Jul 19, 2012 at 1:12 PM, Peter Hofmann <maxl@sunrise.ch> wrote:
>>> Dear all,
>>>
>>> Currently I use one regression for each subgroup of my control sample
>>> and save the subgroup-betas.
>>> Now I want to use the respective betas for a regression on the
>>> treatment observations that correspond to the respective subgroup (to
>>> extract the residuals from these regressions with the treatment
>>> values).
>>>
>>> Currently I use:
>>> . sum subgroupvar
>>> . scalar min1=r(min)
>>> . local j=min1
>>> . scalar max1=r(max)
>>> . local k=max1
>>> . forvalues i=`j'(1)`k' {
>>> . reg y x1 x2 if subgroupvar==`i'
>>> . mat bhat = e(b)
>>> . svmat bhat, names(bhat_`i'_)
>>> . }
>>>
>>> But now I do not know how to proceed:
>>> I want to use the respective subgroup betas in a regression on the
>>> treatment observations (treatments are indicated by a dummy).
>>>
>>> I supposed it should look similar to:
>>> . forvalues i=`j'(1)`k' {
>>> . g yhat = `bhat_*_1' * var1 + `bhat_*_2' * var2    if subgroupvar==`i'
>>> . }
>>> But that results in:
>>> . + invalid name
>>> . r(198);
>>>
>>> I appreciate any help...
>>> Peter