# Re: st: Seemingly unrelated regression (SUR) test joint significance with clustered standard errors

 From Austin Nichols To statalist@hsphsun2.harvard.edu Subject Re: st: Seemingly unrelated regression (SUR) test joint significance with clustered standard errors Date Thu, 26 Feb 2009 12:42:20 -0500

```Eric Lewis <erikylewis@gmail.com>:
The "standard" linear discriminant approach gives an equally high
rejection rate; rejection is about 70% with a nominal size of 5%.
This is because you have 280 clusters and are testing 79 coefficients,
and apparently the ratio of 79 to 280 is too high, and your VCE
estimate is too small.  If you cut down the number of restrictions to
22 (19 var and 3 level dummies), you get a rejection rate of about 10%
with a nominal size of 5%.  Still too high, but more reasonable than
70%.  If you just test three level dummies, you get back to the 5%
rejection rate. Since you can't increase the number of clusters now
that you already have the data (remember the cluster-robust SE
estimator is asymptotically correct as the number of clusters
approaches infinity, not as the number of observations approaches
infinity), I suggest you cut down the number of coefficients tested,
or use another VCE estimator (or switch to a multi-level model with
parametric assumptions on errors).

cap program drop jointtest
program define jointtest, rclass
est clear
drop _all
set obs 6000
gen clusterid = ceil(_n*280/_N)
forvalues i = 1(1)19 {
gen var`i' = invnormal(uniform())
}
egen level = fill(1 2 3 4 1 2 3 4)
gen treat = (uniform() > .5)
foreach level in 1 2 3 4 {
foreach var of varlist var* {
regress `var' treat if level == `level'
est store e_`var'_`level'
}
}
suest e_*_* , vce(cluster clusterid)
testparm treat
return scalar chi = r(chi2)
return scalar df = r(df)
return scalar p = r(p)
loc xi
unab v: var*
foreach i of local v {
if "`xi'"=="" loc xi "`xi' i.level*`i'"
else loc xi "`xi' i.level|`i'"
}
xi: reg treat i.level, vce(cluster clusterid)
return scalar F0 = e(F)
return scalar ndf0 = e(df_m)
return scalar ddf0 = e(df_r)
return scalar pF0 = Ftail(e(df_m),e(df_r),e(F))
xi: reg treat var* i.level, vce(cluster clusterid)
return scalar F1 = e(F)
return scalar ndf1 = e(df_m)
return scalar ddf1 = e(df_r)
return scalar pF1 = Ftail(e(df_m),e(df_r),e(F))
xi: reg treat `xi', vce(cluster clusterid)
return scalar F2 = e(F)
return scalar ndf2 = e(df_m)
return scalar ddf2 = e(df_r)
return scalar pF2 = Ftail(e(df_m),e(df_r),e(F))
eret clear
end
simulate, reps(500) seed(4): jointtest
gen frac05 = (p < .05)
gen lindisc05 = (pF2 < .05)
gen nointrs05 = (pF1 < .05)
gen justlev05 = (pF0 < .05)
su *05

On Thu, Feb 26, 2009 at 10:32 AM, Austin Nichols
<austinnichols@gmail.com> wrote:
> Eric Lewis <erikylewis@gmail.com>:
> I don't see an obvious bug in the code, but it is not likely that "the
> -test- command is mis-programmed" as you surmise.  Much more likely
> that the cluster-robust SE estimator you are using (note that -suest-
> uses a form of cluster-robust SE estimation even in the absence of a
> vce option) is biased downward, leading to over-rejection of the null
> (a well-known if little appreciated feature of the cluster-robust SE
> estimator; see Rogers 1993) .    This tends to be more of a problem
> when testing a hypothesis that eats up more degrees of freedom (as
> found by Nichols and Schaffer 2007 in unpublished work).
>
> In any case, you should compare your -suest- method to the standard
> method of checking that treatment status is not correlated with
> baseline characteristics--which is a comparison of means via
> -hotelling- or an equivalent F-test in a regression of the treatment
> indicator on baseline characteristics.  For example, suppose south is
> the treatment indicator and you want to compare pre-treatment baseline
>
> sysuse nlsw88, clear
> qui reg south grade wage
> di e(F)
>
> That model is a regression of treat on var* in your case:
>
> hotelling var*, by(treat)
> reg treat var*
>
> which you can make cluster-robust:
>
> reg treat var*, vce(cluster clusterid)
>
> and to condition on level try something like:
>
> loc xi
> unab v: var*
> foreach i of local v {
>  if "`xi'"=="" loc xi "`xi' i.level*`i'"
>  else loc xi "`xi' i.level|`i'"
> }
> xi: reg treat `xi', vce(cluster clusterid)
>
> and let us know what the result is...
>
> Nichols and Schaffer. 2007. http://www.stata.com/meeting/13uk/abstracts.html
> Rogers. 1993. http://www.stata.com/support/faqs/stat/stb13_rogers.pdf
>
>
> On Wed, Feb 25, 2009 at 6:28 PM, Eric Lewis <erikylewis@gmail.com> wrote:
>> Hi,
>>
>> I was working on some analysis of an experiment where I am checking to
>> make sure that treatment status is not correlated to baseline
>> characteristics conditional on an exogenous category ("level") where
>> standard errors are clustered.  To get one single statistic, I was
>> combining variables using a SUR model.  I kept getting a rejection of
>> null hypothesis, and wondered if the test program is not written
>> correctly.  So I wrote a monte carlo simulation to check the quality
>> of the "test" command and it looks like indeed the "test" command is
>> mis-programmed.  The simulation code is posted below, and you can
>> check out the high fraction of p values that reject.  Simulations
>> without the cluster command seem to give a much more reasonable
>> distribution of p values.
>>
>> Does anyone know of some alternative way of testing joint significance
>> in SUR with clustered standard errors?
>> (Or perhaps there's a bug in my simulation code . . .)
>>
>> Thanks,
>>
>> Eric
>>
>> #delimit ;
>> cap program drop jointtest ;
>> program define jointtest, rclass ;
>>        #delimit ;
>>        est clear ;
>>        drop _all ;
>>        set obs 6000 ;
>>        gen clusterid = ceil(_n*280/_N) ;
>>        forvalues i = 1(1)19 { ;
>>                gen var`i' = invnormal(uniform()) ;
>>        } ;
>>        egen level = fill(1 2 3 4 1 2 3 4) ;
>>        gen uniform = uniform() ;
>>        gen treat = (uniform > .5);
>>        gen treat2 = (uniform >= .25 & uniform < .5) ;
>>        gen treat3 = (uniform >= .50 & uniform < .75) ;
>>        gen treat4 = (uniform >= .75) ;
>>
>>        foreach level in 1 2 3 4 { ;
>>                foreach var of varlist var* { ;
>>                        regress `var' treat if level == `level' ;
>>                        * regress `var' treat2 treat3 treat4 if level == `level' ;
>>                        est store e_`var'_`level' ;
>>                } ;
>>        } ;
>>
>>        suest e_*_* , vce(cluster clusterid);
>>        testparm treat;
>>        * testparm treat2 treat3 treat4;
>>                return scalar chi = r(chi2);
>>                return scalar df = r(df);
>>                return scalar p = r(p);
>> end;
>>
>> simulate chitest = r(chi) dftest = r(df) ptest = r(p), reps(100): jointtest;
>> tab ptest;
>> gen frac05 = (ptest < .05);
>> tab frac05;
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```