Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Goodness-of-fit tests after -gsem-

From	John Antonakis <[email protected]>
To	[email protected]
Subject	Re: st: Goodness-of-fit tests after -gsem-
Date	Sat, 13 Jul 2013 10:11:47 +0200

Hi Jenny:

At this time, and based on my asking the tech. support people at Stata, the overidentification test (and here I mean the likelihood ratio test, or chi-square test) is not available for -gsem-, which is unfortunate, but understandable. This is only version 2 of -sem- and the program is really very advanced as compared to other programs when they were on version 2 (AMOS is on version a zillion and still can't do gsem, for example). From what tech. support told me, it is on the wishlist, and hopefully we will have a Yuan-Bentler style chi-square test for models estimated by gsem, like Mplus does.
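
To make concrete what the likelihood ratio (chi-square) test against the saturated model computes, here is a minimal sketch in Python. It is not Stata's implementation. The log-likelihoods and degrees of freedom in the usage lines are hypothetical, and the function names `chi2_sf` and `lr_test` are illustrative only. The statistic is twice the gap in log-likelihood between the saturated and the fitted model, referred to a chi-square distribution whose degrees of freedom equal the number of overidentifying restrictions.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting values of the regularized upper incomplete gamma Q(a, z):
    # Q(1/2, z) = erfc(sqrt(z)) when df is odd, Q(1, z) = exp(-z) when df is even.
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), until a == df / 2.
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def lr_test(ll_model, ll_saturated, df):
    """Return (chi-square statistic, p-value) for the fitted vs. saturated model."""
    stat = 2.0 * (ll_saturated - ll_model)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods and degrees of freedom, for illustration only.
stat, p = lr_test(ll_model=-2510.3, ll_saturated=-2500.1, df=8)
print(f"chi2(8) = {stat:.2f}, p = {p:.4f}")
```

A small p-value means the overidentifying restrictions are rejected, that is, the model fails the chi-square test.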

As for assessing fit, you only need the chi-square test--indexes like RMSEA or CFI don't help at all. I elaborate below on an edited version of what I had written recently on SEMNET on this point (in particular see the anecdote about Karl Joreskog, who as you may know, was instrumental in developing SEM, about why approximate fit indexes were invented):

"At the end of the day, science is self-correcting and with time, most researchers will gravitate towards some sort of consensus. I think that what will prevail are methods that are analytically derived (e.g., chi-square test and corrections to it for when it is not well behaved) and found to have support too via Monte Carlo. With respect to the latter, what is funny--well ironic and hypocritical too--is that measures of approximate fit are not analytically derived and the only support that they have is via what I would characterize as weak Monte Carlos--and Monte Carlos, in turn, are often summarily dismissed by the very people who ignore the chi-square test when those Monte Carlos provide evidence for the chi-square test.

We have the following issues that need to be correctly dealt with to ensure the model passes the chi-square test (and also that inference is correct--i.e., with respect to standard errors):

1. low ratio of sample size to estimated parameters (need to correct the chi-square)

2. non-multivariate normal data (need to correct the chi-square)

3. non-continuous measures (need to use appropriate estimator)

4. causal heterogeneity (need to control for sources of variance that render relations heterogenous)*

5. bad measures

6. incorrectly specified model (i.e., the causal structure must reflect reality and all endogeneity threats must be dealt with).

Any of these or a combination of these can make the chi-square test fail. Now, some researchers shrug, in a defeatist kind of way, and say, "well I don't know why my model failed the chi-square test, but I will interpret it in any case because the approximate fit indexes [like RMSEA or CFI] say it is OK." Unfortunately, the researcher will not know to what extent these estimates may be misleading or completely wrong. And, reporting misleading estimates is, I think, unethical and uneconomical for society. That is why all efforts should be made to develop measures and find models that fit. At this time the best test we have is the chi-square test; we can also localize misfit via score tests or modification indexes. I will rejoice the day we find better and stronger tests; however, inventing weaker tests is not going to help us.
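
The situation described above, where an approximate fit index looks "OK" while the chi-square test fails, can be reproduced with a few lines of arithmetic. The sketch below is in Python and uses hypothetical numbers. It assumes one common definition of RMSEA, sqrt(max(chi2 - df, 0) / (df * (N - 1))); some programs use N rather than N - 1. The helper names are illustrative only. The second function relies on the fact that, under a correct model, the chi-square statistic has mean df and variance 2*df.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA under one common definition (assumed here; N - 1 in the denominator)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def standardized_excess(chi2, df):
    """Standard deviations by which chi2 exceeds its null mean (mean df, variance 2*df)."""
    return (chi2 - df) / math.sqrt(2.0 * df)


# Hypothetical values: chi-square statistic, degrees of freedom, sample size.
chi2, df, n = 150.0, 50, 1000
print(f"RMSEA = {rmsea(chi2, df, n):.3f}")
print(f"excess over null mean = {standardized_excess(chi2, df):.1f} SDs")
```

With these made-up numbers the RMSEA falls below the conventional cutoffs of about 0.05 to 0.06, yet the statistic sits ten standard deviations above what a correctly specified model would be expected to produce, so the chi-square test rejects decisively.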

Again, here is a snippet from Cam McIntosh's (2012) recent paper on this point:

"A telling anecdote in this regard comes from Dag Sörbom, a long-time collaborator of Karl Jöreskog, one of the key pioneers of SEM and creator of the LISREL software package. In recounting a LISREL workshop that he jointly gave with Jöreskog in 1985, Sörbom notes that: ‘‘In his lecture Karl would say that the Chi-square is all you really need. One participant then asked ‘Why have you then added GFI [goodness-of-fit index]?’ Whereupon Karl answered ‘Well, users threaten us saying they would stop using LISREL if it always produces such large Chi-squares. So we had to invent something to make people happy. GFI serves that purpose’ (p. 10)’’."

With respect to the causal heterogeneity point, according to Mulaik and James (1995, p. 132), samples must be causally homogenous to ensure that ‘‘the relations among their variable attributes are accounted for by the same causal relations.’’ As we say in our causal claims paper (Antonakis et al., 2010), "causally homogenous samples are not infinite (thus, there is a limit to how large the sample can be). Thus, finding sources of population heterogeneity and controlling for it will improve model fit whether using multiple groups (moderator models) or multiple indicator, multiple causes (MIMIC) models" (p. 1103). This issue is something that many applied researchers fail to understand and completely ignore.


References:

*Antonakis J., Bendahan S., Jacquart P. & Lalive R. (2010). On making causal claims: A review and recommendations. The Leadership Quarterly, 21(6), 1086-1120.

Bera, A. K., & Bilias, Y. (2001). Rao's score, Neyman's C(α) and Silvey's LM tests: an essay on historical developments and some new results. Journal of Statistical Planning and Inference, 97(1), 9-44.

*Bollen, K. A. 1989. Structural equations with latent variables. New York: Wiley.

*James, L. R., Mulaik, S. A., & Brett, J. M. 1982. Causal Analysis: Assumptions, Models, and Data. Beverly Hills: Sage Publications.

*Joreskog, K. G., & Goldberger, A. S. 1975. Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351): 631-639.

McIntosh, C. (2012). Improving the evaluation of model fit in confirmatory factor analysis: A commentary on Gundy, C.M., Fayers, P.M., Groenvold, M., Petersen, M. Aa., Scott, N.W., Sprangers, M.A.J., Velikov, G., Aaronson, N.K. (2011). Comparing higher-order models for the EORTC QLQ-C30. Quality of Life Research. Quality of Life Research, 21(9), 1619-1621.

*Muthén, B. O. 1989. Latent variable modeling in heterogenous populations. Psychometrika, 54(4): 557-585.

*Mulaik, S. A. & James, L. R. 1995. Objectivity and reasoning in science and structural equation modeling. In R. H. Hoyle (Ed.), Structural Equation Modeling: Concepts, Issues, and Applications: 118-137. Thousand Oaks, CA: Sage Publications.

And, here are some examples from my work where the chi-square test was passed (and the first study had a rather large sample)--so I don't live in a theoretical statistical bubble:


http://dx.doi.org/10.1177/0149206311436080
http://dx.doi.org/10.1016/j.paid.2010.10.010

Best,
J.

P.S. Take a look too at the following posts by me on these points on Statalist.


http://www.stata.com/statalist/archive/2013-04/msg00733.html
http://www.stata.com/statalist/archive/2013-04/msg00747.html
http://www.stata.com/statalist/archive/2013-04/msg00765.html
http://www.stata.com/statalist/archive/2013-04/msg00767.html

__________________________________________

John Antonakis
Professor of Organizational Behavior
Director, Ph.D. Program in Management

Faculty of Business and Economics
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis

Associate Editor
The Leadership Quarterly
__________________________________________

On 13.07.2013 08:41, Bloomberg Jenny wrote:
> Hello,
>
> I have a question about goodness-of-fit tests with gsem. (I don't have
> any specific models in mind; it's a general question.)
>
> I'm now reading the Stata 13 manual, and noticed that postestimation
> commands such as -estat gof-, -estat ggof-, and -estat eqgof- can only
> be used after -sem-, and not after -gsem-. This means that
> goodness-of-fit statistics like RMSEA cannot be obtained when you use
> gsem.
>
> Then, how can I test goodness-of-fit if I use -gsem- to analyse a
> non-linear, generalized SEM with latent variables?
>
> I know that AIC and BIC are still available after -gsem- (by -ic-
> option), but they are not for judging fit in absolute terms but for
> comparing the fit of different models. What I'd like to know is if
> there are any practical ways to judge the goodness-of-fit of the model
> in absolute terms.
>
> Any suggestions will be greatly appreciated.
>
>
> Best,
> Jenny
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

