Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Clustered Standard Errors vs HLM for Small Sample Project

From	Austin Nichols <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Clustered Standard Errors vs HLM for Small Sample Project
Date	Mon, 18 Nov 2013 17:34:31 -0500
John Antonakis <[email protected]> :

The mix-up is evident in "I would first check for omitted
fixed-effects," but perhaps I misunderstood what that sentence was
intended to mean. I agree that before relying on a RE model one should
first ensure that RE is a consistent estimator using -xtoverid- (SSC).

Pooled OLS with CRSE does omit fixed effects, but -xtoverid- does not
compare the two approaches (RE v CRSE)--that was my point.  -xtoverid-
will compare FE and RE, with or without CRSE, and a comparison of
pooled regression with CRSE and RE is also possible, but a test for
fixed effects e.g. via the F test at the bottom of -xtreg, fe- ("F
test that all u_i=0" as Stata terms it) would be more in keeping with
"I would first check for omitted fixed-effects." But with FE, the
interpretation of coefs also can change, and one has to worry about
within and between variation in signal-to-noise (FE can sometimes
greatly amplify measurement error, producing greater bias).  Note that
one could reject RE in favor of FE even when pooled OLS is
asymptotically unbiased as the number of panels goes to infinity, if
the within variation is due primarily to measurement error, in which
case FE is actually inferior to RE.
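
A minimal simulation sketch of the measurement-error point above, in
Python using only the standard library. Everything specific here is an
assumption chosen for illustration and not taken from this thread: the
function names, 500 panels of 5 observations, a true slope of 1, a
regressor whose variation is mostly between panels, classical
measurement error, and no correlation between the panel effect and the
regressor (so pooled OLS would be consistent without the measurement
error).

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_within(values, panel_ids):
    """Subtract each panel's own mean (the within transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, panel_ids):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, panel_ids)]


def simulate(n_panels=500, n_periods=5, beta=1.0,
             sd_between=2.0, sd_within=0.5, sd_noise=0.5, seed=12345):
    """Return (pooled OLS slope, FE slope) for one simulated panel.

    All parameter values are hypothetical. The true regressor is mostly
    between-panel variation; the observed regressor adds classical
    measurement error with standard deviation sd_noise.
    """
    rng = random.Random(seed)
    ids, x_obs, y = [], [], []
    for i in range(n_panels):
        panel_level = rng.gauss(0, sd_between)      # between component
        for _ in range(n_periods):
            x_true = panel_level + rng.gauss(0, sd_within)
            ids.append(i)
            x_obs.append(x_true + rng.gauss(0, sd_noise))  # mismeasured
            y.append(beta * x_true + rng.gauss(0, 1.0))
    pooled = slope(x_obs, y)
    # FE estimator: OLS on panel-demeaned data.
    fe = slope(demean_within(x_obs, ids), demean_within(y, ids))
    return pooled, fe


pooled_b, fe_b = simulate()
print(f"pooled OLS slope:  {pooled_b:.3f}")
print(f"FE (within) slope: {fe_b:.3f}")
```

Under these assumptions the pooled slope is attenuated only slightly,
because the measurement error is small relative to the total variation
in the regressor, while the FE slope is attenuated much more, because
demeaning removes the between variation and leaves the measurement
error in place. In Stata the analogous comparison would be between
-regress- with clustered standard errors and -xtreg, fe-.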

-cltest- and -xtcltest- became -chatest- and -xtchatest- as mentioned
at http://www.stata.com/meeting/uk10/UKSUG10.Baum.pdf and for the
latest versions you can pester Mark Schaffer.

On Mon, Nov 18, 2013 at 4:18 PM, John Antonakis <[email protected]> wrote:
> Thanks for the clarifications Austin.
>
> In fact, I do not think I mixed up what you suggest I did. It seemed to me
> that the original question was whether MK should use "clustered standard
> errors or HLM".
>
> I assumed that:
>
> 1. "clustered standard errors" = pooled OLS with cluster-robust standard
> errors (I did not assume that MK was suggesting that this estimator was OLS
> with FE dummies, in which case that OLS estimator is also potentially
> inconsistent if the FEs are omitted)
>
> 2. HLM = RE model.
>
> My suggestion was that before estimating a RE model MK should first ensure
> that the RE estimator is consistent (and I suggested the xtoverid command
> for the Hausman test).
>
> For the cluster size, I am happy to see that you cite Kézdi to the effect
> that 50 clusters are probably sufficient for valid inference. I did not pay
> attention to the unbalanced clusters issue because I was so focused on the
> RE problem; thanks for catching that.
>
> BTW, any idea what has happened to cltest and xtcltest (which you cite on
> p. 23 of your presentation)?
>
> Best,
>
> J.
>
> __________________________________________
>
> John Antonakis
> Professor of Organizational Behavior
> Director, Ph.D. Program in Management
>
> Faculty of Business and Economics
> University of Lausanne
> Internef #618
> CH-1015 Lausanne-Dorigny
> Switzerland
> Tel ++41 (0)21 692-3438
> Fax ++41 (0)21 692-3305
> http://www.hec.unil.ch/people/jantonakis
>
> Associate Editor:
> The Leadership Quarterly
> Organizational Research Methods
> __________________________________________
>
> On 18.11.2013 21:19, Austin Nichols wrote:
>> For good inference, you want not only many clusters, but also clusters
>> that are balanced (which means guidelines about 20 or 30 or 42 or 50
>> clusters are less than helpful):
>> http://www.stata.com/meeting/13uk/nichols_crse.pdf
>>
>> When RE/HLM models and Cluster-Robust SE work well, they give similar
>> answers, but in some circumstances where they work poorly, they can
>> also give similar (wrong) answers:
>> https://appam.confex.com/appam/2013/webprogram/Paper6337.html
>>
>> You need to describe in more detail the source of correlations in
>> errors and regressors to get a good answer--on how to design a
>> simulation to indicate which approach is likely to give the best
>> inference in your setting.
>>
>> In his reply below, John Antonakis seems to be mixing up a comparison
>> between FE and RE (ssc describe xtoverid) with a comparison between FE
>> and pooled OLS with CRSE; whether or not you should use a fixed
>> effects method is a more complicated question than any one test will
>> answer, and depends very strongly on what you believe about
>> measurement error in your predictors.
>>
>> On Mon, Nov 18, 2013 at 10:26 AM, John Antonakis <[email protected]> wrote:
>>> Hi:
>>>
>>> You should not use terms like "HLM" (which is a program in addition to an
>>> estimation method in some disciplines) without defining it (most here do
>>> not use this program but Stata obviously).
>>>
>>> I guess I know what you are after, that is, whether you should estimate a
>>> random-effects (multilevel) model versus a pooled model using OLS with a
>>> cluster-robust estimate of the variance. Before you do anything, and if
>>> you have level 1 (i.e., within-cluster varying) predictors, then you should
>>> be much more worried about omitted fixed-effects than just about robust
>>> standard errors, which are important too. See:
>>>
>>> Halaby, C. N. 2004. Panel models in sociological research: Theory into
>>> practice. Annual Review of Sociology, 30: 507-544.
>>>
>>> So, I would first check for omitted fixed-effects. If the Hausman
>>> endogeneity test (can be tested with the user-written command -xtoverid-
>>> from SSC) is significant, it means that the restriction that your
>>> regressors don't correlate with the uj (i.e., the fixed-effect error term)
>>> is rejected. Then you must model the fixed effects, either with dummies or
>>> using the Mundlak procedure:
>>>
>>> Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. 2010. On making
>>> causal claims: A review and recommendations. The Leadership Quarterly,
>>> 21(6): 1086-1120.
>>>
>>> Next, as for the number of clusters, ideally you'll have between 30 and 50
>>> for valid inference.
>>>
>>> Hth.
>>> J.
>>>
>>> On 18.11.2013 03:06, [email protected] wrote:
>>>>
>>>> I'm using Stata 10 and I'm trying to figure out whether to use clustered
>>>> standard errors or HLM. I have 233 observations from agencies located in
>>>> 10 different states.
>>>>
>>>> The minimum number of observations I have from a state is 3 and the
>>>> maximum number of observations I have is 108, with an average of 23.3.
>>>> I'm not interested in state level differences, I'm only interested in
>>>> results from the agency level and I want to account for the fact that
>>>> there may be some state level effects.
>>>>
>>>> The literature I've read so far doesn't seem to point me in any definite
>>>> direction. The literature seems to say that HLM works best on larger
>>>> datasets, but it also seems to say that you need at least 20 clusters for
>>>> either method to be effective. Does anyone have a suggestion for which of
>>>> these two methods I should use, or at least what I should consider in
>>>> making my choice? Is there some other method I should use?
>>>>
>>>> Thank you in advance for your consideration.
>>>>
>>>> MK
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/