# RE: st: Strange -robust- results with a dummy variable

From: "Liu Yu" <[email protected]>
To: <[email protected]>
Subject: RE: st: Strange -robust- results with a dummy variable
Date: Sat, 22 Jan 2011 14:30:10 -0700

```Dear Michael,

Yes, I guess that is my problem. Your suggestion and the materials from the
link you mentioned, http://www.ats.ucla.edu/stat/stata/library/homvar.htm,
really help. Thank you very much.

Catherine Liu

Dear Catherine

My hunch is that you have a combination of very unequal Ns for the two
levels of the dummy variable -d-, combined with very unequal variances for
the two different groups. If you think of this, for the moment, like a
t-test (or like an ANOVA), this would be described as violating the
homogeneity of variance assumption. This issue is discussed on the web page

http://www.ats.ucla.edu/stat/stata/library/homvar.htm

in the context of an ANOVA framework. As noted on that page, the "robust"
option provides more appropriate p values in such a case.

To check to see if this is the case, I would suggest trying this command

tabstat y, by(d) stat(mean sd n)

which will show the mean, sd, and n for y by d (we are temporarily ignoring
x, for simplicity).

I hope this helps,

Michael N. Mitchell
On 2011-01-21 3.34 PM, Liu Yu wrote:
> Dear Statalist.
>
> I have got a weird result when I run the following two regressions. (In the
> following regressions, y is daily stock return data from 1990 to 2010, x
> is the daily market return data for the same period, and d is a dummy
> variable which equals 1 on Nov-10-2001 and 0 otherwise.)
>
> The first is a simple OLS regression:
>
> . reg y x d
>
>
> ------------------------------------------------------------------------------
>            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>            x |   .0237359     .03177     0.75   0.455    -.0385487    .0860204
>            d |  -.0074946    .025867    -0.29   0.772    -.0582064    .0432172
>        _cons |   .0007387   .0003825     1.93   0.054    -.0000112    .0014886
>
> The second equals the first regression plus the "robust" option:
>
> . reg y x d, robust
>
>
> ------------------------------------------------------------------------------
>              |               Robust
>            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>            x |   .0237359   .0304741     0.78   0.436    -.0360082    .0834799
>            d |  -.0074946    .000539   -13.90   0.000    -.0085514   -.0064378
>        _cons |   .0007387   .0003834     1.93   0.054    -.0000131    .0014904
> ------------------------------------------------------------------------------
>
> I am quite surprised by the fact that the standard error of d has decreased
> significantly after I use the robust option, and its t-statistic changes
> from non-significant to significant. Should I trust the results from the
> second regression? Is there something special that I need to pay attention
> to regarding the dummy variable and the robust option?
>
> Thank you all.
>
> Catherine Liu
>
```
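For anyone who wants to try the check suggested in the thread without the original data, here is a minimal sketch on simulated data. Everything about the simulation is an assumption made for illustration: the seed, the sample size, the coefficient and standard deviation values, and the choice of observation 2500 as the single d == 1 case (mimicking a dummy that flags one day). Only the variable names y, x, and d and the -tabstat- and -regress- commands come from the posts above. In this setup the lone d == 1 observation is fitted exactly by its own dummy, so its residual is zero, which is one way a robust standard error for d can end up far smaller than the conventional one.

```stata
* Hypothetical data loosely resembling the setup described in the thread
clear
set seed 12345
set obs 5000
generate double x = rnormal(0, 0.012)                   // stand-in for market return
generate double y = 0.0007 + 0.02*x + rnormal(0, 0.026) // stand-in for stock return
generate byte d = (_n == 2500)                          // dummy equal to 1 for one observation only

* The check suggested in the thread: mean, sd, and n of y by level of d
tabstat y, by(d) stat(mean sd n)

* Show how unbalanced the two levels of d are
tabulate d

* Conventional versus robust standard errors for the same model
regress y x d
regress y x d, vce(robust)

* Inspect the residual for the d == 1 observation
predict double e, residuals
list y x e if d == 1
```

If the group sizes and spreads reported by -tabstat- and -tabulate- are as lopsided in your own data as they are here, it is worth treating the test on d with caution under either set of standard errors.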