
st: Strange -robust- results with a singleton dummy


From   "Mark Schaffer" <M.E.Schaffer@hw.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   st: Strange -robust- results with a singleton dummy
Date   Fri, 27 Jun 2003 16:10:07 +0100

Dear Statalist colleagues:

I've encountered (via David Stromberg) a peculiar feature of 
regression with heteroskedasticity-robust SEs when using dummy 
variables.  

If a dummy variable takes the value of 1 for a single observation, 
and zeros for the rest, some strange things happen:

1. The robust SEs still look quite plausible.

2. The F-stat is reported as missing.  There is a hyperlink for the 
missing F-stat in the regression output (Stata v7) but it doesn't 
mention the singleton dummy as a possible explanation.

3. The robust var-cov matrix is not of full rank.  Invert it with 
syminv() and one of the rows/columns becomes all zeros (but not 
necessarily the one corresponding to the singleton dummy).
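
Point 3 can be checked numerically rather than by eye.  The following 
is only a minimal sketch under some assumptions: it loads the auto 
data with -sysuse- (available in more recent Stata releases) instead 
of the local path used in this post, and the names V, Vinv and rankV 
are chosen here for illustration.

```stata
* Assumption: -sysuse auto- stands in for the local copy of auto.dta
sysuse auto, clear

* Dummy equal to 1 for the first observation only, 0 otherwise
generate byte singledummy = (_n == 1)

regress weight length singledummy, robust

* syminv() substitutes zeros for any row/column it cannot invert,
* so the number of zeros on the diagonal is the rank deficiency
matrix V    = e(V)
matrix Vinv = syminv(V)
scalar rankV = rowsof(V) - diag0cnt(Vinv)

display "dimension of V = " rowsof(V)
display "rank of V      = " rankV
```

If the rank reported is smaller than the dimension, the matrix is 
singular in the sense described in point 3.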

Does anybody have any ideas on how to interpret this?  Are the robust 
SEs usable anyway?  Is the robust var-cov matrix still usable?

I should note that singleton dummies are not so unusual.  For 
example, one longstanding recommendation for dealing with an outlier 
is to create a dummy for it.  It would seem that this recommendation 
isn't compatible with using robust SEs at the same time.
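
One thing that may be worth looking at in this connection: a dummy 
that equals 1 for a single observation lets OLS fit that observation 
exactly, so its residual is zero up to rounding error.  The sketch 
below only displays that residual; it does not by itself establish 
that this is what drives the results described above.  It assumes the 
auto data are loaded with -sysuse-, and the variable name ehat is 
hypothetical.

```stata
* Assumption: -sysuse auto- stands in for the local copy of auto.dta
sysuse auto, clear

* Dummy equal to 1 for the first observation only, 0 otherwise
generate byte singledummy = (_n == 1)

quietly regress weight length singledummy

* Residuals from the fitted model, stored in double precision
predict double ehat, residuals

* Residual for the one observation picked out by the dummy
list make weight ehat if singledummy == 1
```

The robust variance estimator weights each observation by its squared 
residual, so an observation with a zero residual contributes nothing 
to that part of the calculation.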

A demonstration with the infamous auto.dta follows.

--Mark

. use d:\stata\auto, replace
(1978 Automobile Data)

. 
. gen singledummy=0

. replace singledummy=1 if _n==1
(1 real change made)

<Standard regression, no robust, nothing unusual>

. 
. regress weight length singledummy

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  2,    71) =  302.43
       Model |  39461973.9     2  19730986.9           Prob > F      =  0.0000
    Residual |  4632204.50    71   65242.317           R-squared     =  0.8949
-------------+------------------------------           Adj R-squared =  0.8920
       Total |  44094178.4    73  604029.841           Root MSE      =  255.43

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      length |   33.01849   1.342694    24.59   0.000     30.34123    35.69575
 singledummy |  -26.00488   257.1827    -0.10   0.920    -538.8127    486.8029
       _cons |  -3185.434   254.1358   -12.53   0.000    -3692.167   -2678.702
------------------------------------------------------------------------------

<Var-cov matrix is full rank>

. mat Vinv=syminv(e(V))

. mat list Vinv

symmetric Vinv[3,3]
                  length  singledummy        _cons
     length    40.614269
singledummy    .00285091    .00001533
      _cons     .2131592    .00001533    .00113423


<Same regression with robust, and strange things happen>

. regress weight length singledummy, robust

<SEs look similar to non-robust above, but F-stat is missing>

Regression with robust standard errors                 Number of obs =      74
                                                       F(  1,    71) =       .
                                                       Prob > F      =       .
                                                       R-squared     =  0.8949
                                                       Root MSE      =  255.43

------------------------------------------------------------------------------
             |               Robust
      weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      length |   33.01849   1.279353    25.81   0.000     30.46753    35.56945
 singledummy |  -26.00488   30.18275    -0.86   0.392    -86.18757    34.17782
       _cons |  -3185.434   242.0935   -13.16   0.000    -3668.155   -2702.713
------------------------------------------------------------------------------

<Var-cov matrix isn't full rank>

. mat Vinv=syminv(e(V))

. mat list Vinv

symmetric Vinv[3,3]
                  length  singledummy        _cons
     length            0
singledummy            0    .00114255
      _cons            0    .00002822    .00001776


Prof. Mark E. Schaffer
Director
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS  UK
44-131-451-3494 direct
44-131-451-3008 fax
44-131-451-3485 CERT administrator
http://www.som.hw.ac.uk/cert


