
# Re: st: Comparing OLS regression coefficients across groups

From: Tim Wade
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Comparing OLS regression coefficients across groups
Date: Mon, 2 Jul 2012 11:31:03 -0400

```
Regarding your examples a) and c):

Note that -suest- uses robust standard error estimates when combining
models, so to get the same results for a) and c) you would need to
run:

regress mpg i.foreign##c.weight, robust

However, I tried this, and a) and c) still produce slightly different
results.

I wonder if the discrepancy is due to the use of -regress- and the
issue noted in the help file for -suest-:

"regress does not include its ancillary parameter, the residual
variance, in its coefficient vector and (co)variance matrix.
Moreover, while the score option is allowed with predict after
regress, a score variable is generated for the mean but not for the
variance parameter.  suest contains special code that assigns the
equation name mean to the coefficients for the mean, adds the equation
lnvar for the log variance, and computes the appropriate score
variables."

Results for logistic regression are exactly the same when the robust
standard error calculation is used:

sysuse auto.dta
xtile highmpg=mpg, nq(2)
logistic foreign i.highmpg##c.weight, robust
logistic foreign weight if highmpg==1
est store m1
logistic foreign weight if highmpg==2
est store m2
suest m1 m2
lincom [m1_foreign]weight - [m2_foreign]weight, or

. sysuse auto.dta, clear
(1978 Automobile Data)

. xtile highmpg=mpg, nq(2)

.
. logistic foreign i.highmpg##c.weight, robust

Logistic regression                               Number of obs   =         74
                                                  Wald chi2(3)    =      17.20
                                                  Prob > chi2     =     0.0006
Log pseudolikelihood = -26.661967                 Pseudo R2       =     0.4079

----------------------------------------------------------------------------------
                 |               Robust
         foreign | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
       2.highmpg |   .0000242   .0001744    -1.48   0.140     1.83e-11    32.19213
          weight |   .9940209   .0020784    -2.87   0.004     .9899557    .9981029
                 |
highmpg#c.weight |
              2  |   1.002946   .0023943     1.23   0.218     .9982638    1.007649
                 |
           _cons |   5.02e+07   3.33e+08     2.67   0.008     110.5712    2.28e+13
----------------------------------------------------------------------------------

.
. logistic foreign weight if highmpg==1

Logistic regression                               Number of obs   =         38
                                                  LR chi2(1)      =      14.12
                                                  Prob > chi2     =     0.0002
Log likelihood = -7.7378095                       Pseudo R2       =     0.4770

------------------------------------------------------------------------------
     foreign | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   .9940209   .0023853    -2.50   0.012     .9893568     .998707
       _cons |   5.02e+07   3.84e+08     2.32   0.020     15.46429    1.63e+14
------------------------------------------------------------------------------

.
. est store m1

.
. logistic foreign weight if highmpg==2

Logistic regression                               Number of obs   =         36
                                                  LR chi2(1)      =      11.95
                                                  Prob > chi2     =     0.0005
Log likelihood = -18.924158                       Pseudo R2       =     0.2399

------------------------------------------------------------------------------
     foreign | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |    .996949   .0011619    -2.62   0.009     .9946744    .9992288
       _cons |    1216.39   3286.893     2.63   0.009     6.095119    242752.5
------------------------------------------------------------------------------

.
. est store m2

.
. suest m1 m2

Simultaneous results for m1, m2

                                                  Number of obs   =         74

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
m1_foreign   |
      weight |   -.005997   .0020909    -2.87   0.004    -.0100951   -.0018989
       _cons |   17.73103   6.645719     2.67   0.008      4.70566     30.7564
-------------+----------------------------------------------------------------
m2_foreign   |
      weight |  -.0030557   .0011521    -2.65   0.008    -.0053137   -.0007976
       _cons |   7.103643   2.753482     2.58   0.010     1.706917    12.50037
------------------------------------------------------------------------------

.
. lincom [m1_foreign]weight - [m2_foreign]weight, or

 ( 1)  [m1_foreign]weight - [m2_foreign]weight = 0

------------------------------------------------------------------------------
             | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |    .997063   .0023803    -1.23   0.218     .9924086    1.001739
------------------------------------------------------------------------------

On Wed, Jun 27, 2012 at 10:31 PM, TWC <twcmisc@gmail.com> wrote:
> Greetings to all,
>
> I need to compare regression coefficients across 2 groups to determine
> whether the effect for one group is significantly different from the
> other.
>
> a. Run a regression over all groups combined, adding the appropriate
> interaction terms which would indicate the difference and its
> significance.
>
> b. Similar to (a), but do not require the variance of the residual to
> be the same for both groups.  From
> http://www.stata.com/support/faqs/statistics/pooling-data-and-chow-tests/
>
> c. (suest) - Run regressions separately per group, storing each result
> separately then using suest and lincom to calculate the difference in
> coefficients.
>
> Two questions:
>
> 1.  The above methods produce different standard errors for the
> interaction term (or equivalent).  I gather that this is due to the
> differences in assumptions on the variance of the residual and sample
> size, but I am not sure exactly how.  And in cases (like the example
> below) where the significance hovers at the threshold, which std err /
> P-value is the appropriate one to report?
>
>    Method a: Std err = .0017846, P>|t| = 0.015
>    Method b: Std err = .0025643, P>|t| = 0.087
>    Method c: Std err = .0017067, P>|z| = 0.009   (test reports Prob > chi2(1) = 0.0091)
>
> 2. In the example below, the standard error for _b[weight] for
> Domestic cars also differs.  While it is significant in all methods,
> which std err should be reported?
>
>    Method a: Std err = .0006622, P>|t| = 0.000
>    Method b: Std err = .0004633, P>|t| = 0.000
>    Method c: Std err = .0004654, P>|t| = 0.000   (suest reports std err = .0005334, P>|z| = 0.000)
>
> Much thanks!
> Tian
>
> Truncated log below:
>
> . sysuse auto
>
> /************ Method a *********************/
>
> . regress mpg i.foreign##c.weight
>
> ----------------------------------------------------------------------------------
>              mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -----------------+----------------------------------------------------------------
>           weight |  -.0059751   .0006622    -9.02   0.000    -.0072958   -.0046544
>                  |
> foreign#c.weight |
>               1  |  -.0044509   .0017846    -2.49   0.015    -.0080101   -.0008916
> ----------------------------------------------------------------------------------
>
> . test 1.foreign#c.weight
>
>  ( 1)  1.foreign#c.weight = 0
>
>        F(  1,    70) =    6.22
>             Prob > F =    0.0150
>
> /************ Method b *********************/
>
> . gen g1 = (foreign==0)
> . gen g2 = (foreign==1)
> . gen g2weight = g2*weight
> . regress mpg weight g2 g2weight
> . predict r, resid
> . sum r if g1
> . gen w = r(Var) * (r(N)-1)/(r(N)-3) if g1
> . sum r if g2
> . replace w = r(Var)*(r(N)-1)/(r(N)-3) if g2
> . regress mpg weight g2 g2weight [aw=1/w]
>
> ------------------------------------------------------------------------------
>          mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       weight |  -.0059751   .0004633   -12.90   0.000    -.0068992    -.005051
>
>     g2weight |  -.0044509   .0025643    -1.74   0.087    -.0095653    .0006636
> ------------------------------------------------------------------------------
>
> . test g2weight
>
>  ( 1)  g2weight = 0
>
>        F(  1,    70) =    3.01
>             Prob > F =    0.0870
>
>
> /************ Method c *********************/
>
> . regress mpg weight if foreign
> . estimates store Foreign
> . regress mpg weight if !foreign
>
> ------------------------------------------------------------------------------
>          mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>       weight |  -.0059751   .0004654   -12.84   0.000    -.0069098   -.0050403
> ------------------------------------------------------------------------------
>
> . estimates store Domestic
> . suest Domestic Foreign
>
> Simultaneous results for Domestic, Foreign
>
> --------------------------------------------------------------------------------
>                |               Robust
>                |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> ---------------+----------------------------------------------------------------
> Domestic_mean  |
>         weight |  -.0059751   .0005334   -11.20   0.000    -.0070205   -.0049297
> ---------------+----------------------------------------------------------------
>
> . lincom [Foreign_mean]weight - [Domestic_mean]weight
>
>  ( 1)  - [Domestic_mean]weight + [Foreign_mean]weight = 0
>
> ------------------------------------------------------------------------------
>              |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>          (1) |  -.0044509   .0017067    -2.61   0.009    -.0077959   -.0011059
> ------------------------------------------------------------------------------
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
```
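One possibility worth checking for the remaining OLS discrepancy between a) and c), which this thread does not confirm, is that -regress, vce(robust)- and -suest- scale the sandwich variance estimate with different finite-sample multipliers: N/(N-k) for -regress- versus N/(N-1) for -suest-. The sketch below is only a check of that assumption. The scalar names (`se_reg`, `se_suest`, `n_obs`, `df_resid`) are illustrative and do not come from the original post. If the assumption holds, the two displayed numbers should agree.

```stata
sysuse auto, clear

* Method a), with robust standard errors as suggested in the reply
regress mpg i.foreign##c.weight, vce(robust)
scalar se_reg   = _se[1.foreign#c.weight]
scalar n_obs    = e(N)       // number of observations in the pooled model
scalar df_resid = e(df_r)    // residual degrees of freedom, N - k

* Method c): separate regressions combined with -suest-
regress mpg weight if foreign
estimates store Foreign
regress mpg weight if !foreign
estimates store Domestic
suest Domestic Foreign
lincom [Foreign_mean]weight - [Domestic_mean]weight
scalar se_suest = r(se)

* Compare the ratio of the two standard errors with the ratio implied
* by the assumed multipliers, sqrt((N-1)/(N-k))
display "ratio of std. errs (regress / suest) = " scalar(se_reg)/scalar(se_suest)
display "sqrt((N-1)/(N-k))                    = " sqrt((scalar(n_obs)-1)/scalar(df_resid))
```

If the two numbers differ, the explanation lies elsewhere, for example in the residual-variance handling quoted from the -suest- help file above.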