
Note: This FAQ is for users of releases prior to Stata 6. It is not relevant for more recent versions.

Why don’t the old huber results match the new robust versions?

Title		Estimating robust standard errors in Stata
Author		James Hardin, StataCorp

The new versions are better (less biased).

The new implementation of the robust estimate of variance scales the estimated variance matrix to make it less biased.

Unclustered data

Estimating robust standard errors in Stata 4.0 resulted in

 . hreg price weight displ
 
 Regression with Huber standard errors               Number of obs    =      74
                                                     R-squared        =  0.2909
                                                     Adj R-squared    =  0.2710
                                                     Root MSE         = 2518.38
 
 ------------------------------------------------------------------------------
    price |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
 ---------+--------------------------------------------------------------------
   weight |   1.823366   .7648832      2.384   0.020       .2982323      3.3485
    displ |   2.087054   7.284658      0.286   0.775      -12.43814    16.61225
    _cons |    247.907   1106.467      0.224   0.823      -1958.326     2454.14
 ------------------------------------------------------------------------------

and the same model in Stata 5.0 is

 . regress price weight displ, robust
    
 Regression with robust standard errors                 Number of obs =      74
                                                        F(  2,    71) =   14.44
                                                        Prob > F      =  0.0000
                                                        R-squared     =  0.2909
                                                        Root MSE      =  2518.4
 
 ------------------------------------------------------------------------------
          |               Robust
    price |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
 ---------+--------------------------------------------------------------------
   weight |   1.823366   .7808755      2.335   0.022       .2663446    3.380387
    displ |   2.087054   7.436967      0.281   0.780      -12.74184    16.91595
    _cons |    247.907   1129.602      0.219   0.827      -2004.454    2500.269
 ------------------------------------------------------------------------------

Stata 5.0 scales the variance matrix using

         n  
       -----
       n - k

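for the (unclustered) regression results, where n is the number of observations and k is the number of estimated coefficients, including the constant (here n = 74 and k = 3). Because the factor applies to the variance, each standard error is multiplied by its square root. To match the previous results, we can undo that scaling: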

 . di .7808755*sqrt(71/74)
 .76488318
 
 . di 7.436967*sqrt(71/74)
 7.284658
 
 . di 1129.602*sqrt(71/74)
 1106.4678
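The same arithmetic can be written as a small function. This is a minimal sketch in Python rather than Stata, intended only to illustrate the rescaling; the function name `unscale_unclustered` is hypothetical, and the inputs are the Stata 5.0 standard errors reported above.

```python
import math


def unscale_unclustered(se_new, n, k):
    """Undo the n/(n-k) variance scaling on a robust standard error.

    The factor applies to the variance, so the standard error is
    multiplied by sqrt((n - k) / n).
    """
    return se_new * math.sqrt((n - k) / n)


# Values taken from the Stata 5.0 output above (n = 74 observations,
# k = 3 coefficients including the constant).
n, k = 74, 3
stata5_se = {"weight": 0.7808755, "displ": 7.436967, "_cons": 1129.602}

for name, se in stata5_se.items():
    print(name, unscale_unclustered(se, n, k))
```

The printed values should agree with the hreg standard errors up to the rounding in the displayed output.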

Clustered data

Estimating robust standard errors for clustered data in Stata 4.0 results in

 . hreg price weight displ, group(rep78)
 
 Regression with Huber standard errors               Number of obs    =      69
                                                     R-squared        =  0.3108
                                                     Adj R-squared    =  0.2899
                                                     Root MSE         = 2454.21
 Grouping variable: rep78
 ------------------------------------------------------------------------------
    price |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
 ---------+--------------------------------------------------------------------
   weight |   1.039647   .8439705      1.232   0.222      -.6453948    2.724688
    displ |   8.887734   7.450619      1.193   0.237      -5.987907    23.76337
    _cons |   1234.034   1986.931      0.621   0.537      -2733.002    5201.069
 ------------------------------------------------------------------------------

The same model run in Stata 5.0 results in

 . regress price weight displ, robust cluster(rep78)
 
 Regression with robust standard errors                 Number of obs =      69
                                                        F(  2,     4) =    3.40
                                                        Prob > F      =  0.1372
                                                        R-squared     =  0.3108
 Number of clusters (rep78) = 5                         Root MSE      =  2454.2
 
 ------------------------------------------------------------------------------
          |               Robust
    price |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
 ---------+--------------------------------------------------------------------
   weight |   1.039647   .9577778      1.085   0.339      -1.619571    3.698864
    displ |   8.887734   8.455317      1.051   0.353      -14.58799    32.36346
    _cons |   1234.034   2254.864      0.547   0.613      -5026.472    7494.539
 ------------------------------------------------------------------------------

In Stata 5.0, the scale factor applied to the variance matrix for clustered data is

        n - 1         g  
        -----   x   -----
        n - k       g - 1

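where g is the number of clusters (here n = 69, k = 3, and g = 5). To match the previous results, we can undo that scaling by multiplying each standard error by the square root of the inverse factor: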

 . di .9577778*sqrt(4/5)*sqrt(66/68)
 .84397051
 
 . di 8.455317*sqrt(4/5)*sqrt(66/68)
 7.4506198
 
 . di 2254.864*sqrt(4/5)*sqrt(66/68)
 1986.9313
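The clustered correction can be sketched the same way. This is an illustrative Python function, not Stata code; the name `unscale_clustered` is hypothetical, and the inputs are the Stata 5.0 standard errors from the clustered output above.

```python
import math


def unscale_clustered(se_new, n, k, g):
    """Undo the clustered variance scaling on a robust standard error.

    The variance is scaled by (n-1)/(n-k) * g/(g-1), so the standard
    error is divided by the square root of that factor.
    """
    factor = ((n - 1) / (n - k)) * (g / (g - 1))
    return se_new / math.sqrt(factor)


# Values taken from the Stata 5.0 clustered output above
# (n = 69 observations, k = 3 coefficients, g = 5 clusters).
n, k, g = 69, 3, 5
stata5_se = {"weight": 0.9577778, "displ": 8.455317, "_cons": 2254.864}

for name, se in stata5_se.items():
    print(name, unscale_clustered(se, n, k, g))
```

With g = 5, the g/(g-1) term alone inflates the variance by 25%, which is why the clustered standard errors differ more between versions than the unclustered ones.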

Note also that Stata 5.0 includes an F test in the header of the output that is the Wald test based on the robust variance estimate.

There is one final important difference. The hreg command used n-1 as the degrees of freedom for the t tests of the coefficients, which is anticonservative; Stata 5.0 now uses g-1. The more conservative definition of the degrees of freedom provides much more accurate confidence intervals. So for a dataset with a small number of groups (clusters) and a large number of observations, the difference between regress, robust cluster() and the old hreg will show up in the p-values of the t statistics: the scale factor becomes much less important, but the difference in degrees of freedom remains important.
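The effect of the degrees-of-freedom change can be read off the two clustered outputs above by backing out the critical value implied by each reported confidence interval. The sketch below uses Python only for the arithmetic; the function name `implied_critical_value` is hypothetical, and the numbers are the weight rows of the two clustered tables.

```python
def implied_critical_value(coef, se, ci_upper):
    """Return the t critical value implied by a reported interval.

    A symmetric interval is coef +/- crit * se, so
    crit = (ci_upper - coef) / se.
    """
    return (ci_upper - coef) / se


# weight row, Stata 4.0 hreg with group(rep78)
old_crit = implied_critical_value(1.039647, 0.8439705, 2.724688)

# weight row, Stata 5.0 regress with robust cluster(rep78)
new_crit = implied_critical_value(1.039647, 0.9577778, 3.698864)

print(f"hreg:                      {old_crit:.3f}")
print(f"regress, robust cluster(): {new_crit:.3f}")
```

The Stata 5.0 value is about 2.776, the two-sided 5% critical value of a t distribution with 4 = g-1 degrees of freedom. The hreg value is close to 2, as expected when the degrees of freedom are large. The wider Stata 5.0 interval therefore reflects both the larger standard error and the larger critical value.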
