st: deriving the BIC when the vce(robust) option is used

From	mario fiorini <[email protected]>
To	[email protected]
Subject	st: deriving the BIC when the vce(robust) option is used
Date	Thu, 29 Nov 2012 09:10:11 +1100

Dear Statalist,
using Stata 11.2, I was trying to derive the Bayesian Information
Criterion (BIC) after a regression with the vce(robust) option, and
noted that the BIC is computed using the rank of e(V). However, the
rank of e(V) was lower than the number of coefficients. What I think
is happening is that I have a variable that is nonzero for only 1
observation in the estimation sample (I have a lot of dummy
variables). Stata is clear about what happens in this case. From Stata:

"Is there a regressor that is nonzero for only 1 observation or for one cluster?

    The VCE you have just estimated is not of sufficient rank to perform the
    model test.  This can happen if there is a variable in your model that is
    nonzero for only 1 observation in the estimation sample.  Likewise, it
    can happen if a variable is nonzero for only one cluster when using the
    cluster-robust VCE.  In such cases the derivative of the sum-of-squares
    or likelihood function with respect to that variable's parameter is zero
    for all observations.  That implies that the outer-product-of-gradients
    (OPG) variance matrix is singular.  Because the OPG variance matrix is
    used in computing the robust variance matrix, the latter is therefore
    singular as well."

However, two things surprised me:
1 - the reported e(df_m) was not equal to the actual number of
coefficients
2 - the BIC is determined using the rank of e(V) rather than the
actual number of coefficients

The code below replicates this situation, using in one case vce(ols)
[the default] and in the other vce(robust)

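 * Start
clear
set seed 12345 // arbitrary seed, added so the draws are reproducible
set obs 1000
ge id = _n
ge var2 = 0
replace var2=1 if id==1 // nonzero for only 1 observation
ge var3 = invnorm(uniform()) // standard normal draws
ge var4 = invnorm(uniform())

* default VCE, i.e. vce(ols)
reg var3 var2 var4
ereturn list
estat ic

* robust VCE
reg var3 var2 var4, vce(robust)
ereturn list
estat ic

 * Ends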

The estimated coefficients are the same in both cases, while the
standard errors are not, as expected. However, note that e(df_m)
and the BIC differ depending on the vce option.
Is this correct? Shouldn't e(df_m) always report the actual number of
coefficients, and the BIC be calculated accordingly?
Any clarification would be great.
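
For reference, here is a minimal sketch of the comparison in question,
to be run right after the vce(robust) regression. It assumes that
regress stores e(ll), e(N) and e(rank), and that the BIC has the usual
form -2*ll + k*ln(N). The scalar names k_coef, bic_rank and bic_coef
are only illustrative.

```stata
* Run after: reg var3 var2 var4, vce(robust)
* Assumption: BIC = -2*ll + k*ln(N), with k counted in two different ways
scalar k_coef   = colsof(e(b))                   // coefficients, incl. the constant
scalar bic_rank = -2*e(ll) + e(rank)*ln(e(N))    // k = rank of e(V)
scalar bic_coef = -2*e(ll) + k_coef*ln(e(N))     // k = number of coefficients
display "rank of e(V) = " e(rank) "   coefficients = " k_coef
display "BIC using rank         = " bic_rank
display "BIC using coefficients = " bic_coef
```

If estat ic does use the rank of e(V), bic_rank should match its
output, and bic_coef is the value I would have expected instead.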

Mario Fiorini