
Re: st: Model identification in Stata sem()


From   John Antonakis <John.Antonakis@unil.ch>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Model identification in Stata sem()
Date   Sat, 08 Dec 2012 10:40:01 +0100

At a basic level, one can also work out by hand, from the number of observed variables in the model and the constraints imposed, whether the model is under-identified, just-identified, or over-identified. This counting rule is a necessary but not a sufficient condition for identification.

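e.g.,
* simulate two indicators of a common factor x
clear
set seed 100
set obs 1000
gen x = rnormal()
gen x1 = x + rnormal()
gen x2 = x + rnormal()
* two-indicator CFA; X is the latent variable
sem (X -> x1 x2)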

In the case of a two-indicator confirmatory factor analysis, the model is not identified. The variance-covariance matrix of the v=2 observed variables has v(v+1)/2 = 3 distinct elements (two variances and one covariance), and what is being estimated is:

1. 1 loading for one of the indicators (the other is constrained to 1)
2. 1 variance of the latent variable
3. 2 variances of the disturbances

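3-4 = -1 degrees of freedom, so the model is under-identified. It could be made just-identified, though, by imposing constraints (e.g., that the loadings are equal, as under tau-equivalence, and that the variance of X is unity), which leaves 3 free parameters (one common loading and two disturbance variances) for the 3 elements: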

sem (X ->x1@a x2@a), var(X@1)

Now, if we introduce a third observed variable (e.g., y, predicted by the latent variable), we have 6 elements in the variance-covariance matrix, and what is being estimated is:

1. 1 loading for one of the indicators (the other is constrained to 1)
2. 1 variance of the latent variable
3. 3 variances of the disturbances
4. 1 structural coefficient

6-6=0 (the model is just-identified)
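
The counting in the examples above can be sketched in a few lines. This is only an illustration of the arithmetic, written in Python rather than Stata; the function names are hypothetical, and the label it prints reflects the counting rule alone, which is necessary but not sufficient for identification.

```python
def covariance_elements(v):
    """Distinct variances and covariances among v observed variables: v(v+1)/2."""
    return v * (v + 1) // 2


def degrees_of_freedom(v, free_parameters):
    """Covariance elements minus the number of freely estimated parameters."""
    return covariance_elements(v) - free_parameters


def counting_rule_status(df):
    """Label implied by the counting rule only (not a proof of identification)."""
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"


# Two-indicator CFA: 1 free loading + 1 latent variance + 2 disturbance variances
df_two = degrees_of_freedom(2, 1 + 1 + 2)

# Same model with equal loadings and var(X) fixed at 1:
# 1 common loading + 2 disturbance variances
df_constrained = degrees_of_freedom(2, 1 + 2)

# Adding y: 1 free loading + 1 latent variance + 3 disturbance variances
# + 1 structural coefficient
df_three = degrees_of_freedom(3, 1 + 1 + 3 + 1)

for label, df in [
    ("two indicators", df_two),
    ("two indicators, constrained", df_constrained),
    ("two indicators plus y", df_three),
]:
    print(f"{label}: df = {df} ({counting_rule_status(df)})")
```

The negative value for the unconstrained two-indicator model corresponds to the "chi2(-1)" that Stata reports for it.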

Thus, a model can generally be brought to satisfy this counting rule by adding more observed variables or by imposing constraints.

Of course, there are other issues related to identification, such as checking for local identification (as Jay suggested below) and empirical underidentification. For the latter, see page 50 of:

Kenny, D. A. (1979). Correlation and causality. New York: Wiley-Interscience. Kenny has made this book freely available here: http://davidakenny.net/books.htm

See also McDonald, R. P. (1982). A note on the investigation of local and global identifiability. Psychometrika, 47, 101-103.
http://link.springer.com/article/10.1007%2FBF02293855

Best,
J.


__________________________________________

Prof. John Antonakis
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis

Associate Editor
The Leadership Quarterly
__________________________________________

On 08.12.2012 07:26, JVerkuilen (Gmail) wrote:
On Fri, Dec 7, 2012 at 1:19 PM, William Buchanan
<william@williambuchanan.net> wrote:
Hi Robert,

On the slide (32) that you referenced, there may not be a "formal" warning in the form of a blaring error message, but the output shown includes information (or, more accurately, a lack thereof) that indicates problems with the model. The "chi2(-1)" and "Prob > chi2 = ." entries serve as a subtle indication that the model is not identified. Any time "." shows up in the output, it generally indicates that there were problems fitting the model to the data, which should be investigated further.

There are no truly general tests of identification of a model. A
number of algebraic tests exist in many cases and I suspect that other
SEM packages are checking them. You can check local identification by
computing the Jacobian matrix and checking its rank, which must be
full. Bekker, Merckens and Wansbeek (1994) wrote a nice book on the
topic and there are some other nice articles around which I can dig up
references to if desired.
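
A minimal sketch of the Jacobian rank check for the two-indicator model discussed in this thread is given below. It is in plain Python and is not how -sem- or any other package performs the check. The parameter labels (lam2, phi, th1, th2, a), the evaluation points, and the tolerance are all assumptions chosen for illustration. The Jacobian of the model-implied variances and covariances with respect to the free parameters is approximated by central differences, and its rank is compared with the number of parameters.

```python
def implied_two_indicator(params):
    """Implied [var(x1), cov(x1,x2), var(x2)] with the first loading fixed at 1."""
    lam2, phi, th1, th2 = params
    return [phi + th1, lam2 * phi, lam2 ** 2 * phi + th2]


def implied_constrained(params):
    """Same model with equal loadings a and the latent variance fixed at 1."""
    a, th1, th2 = params
    return [a * a + th1, a * a, a * a + th2]


def jacobian(func, params, h=1e-6):
    """Central-difference Jacobian of func evaluated at params."""
    n_out = len(func(params))
    jac = [[0.0] * len(params) for _ in range(n_out)]
    for j in range(len(params)):
        up, down = list(params), list(params)
        up[j] += h
        down[j] -= h
        f_up, f_down = func(up), func(down)
        for i in range(n_out):
            jac[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return jac


def matrix_rank(rows, tol=1e-6):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


# Unconstrained model: 4 parameters but only 3 implied moments
rank_free = matrix_rank(jacobian(implied_two_indicator, [1.0, 1.0, 1.0, 1.0]))

# Constrained model at a = 1: 3 parameters, 3 implied moments
rank_constrained = matrix_rank(jacobian(implied_constrained, [1.0, 1.0, 1.0]))

# Constrained model at a = 0: the loading column of the Jacobian vanishes
rank_zero_loading = matrix_rank(jacobian(implied_constrained, [0.0, 1.0, 1.0]))

print(rank_free, rank_constrained, rank_zero_loading)
```

In this sketch the unconstrained model cannot reach full column rank because it has more parameters than moments, the constrained model has full rank at a = 1, and the rank drops at a = 0, which is one way empirical underidentification can show up.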

A while back someone posted an example of a model fit by -sem- (an
exploratory factor analysis) and I showed that it was unidentified.
The sign was that the standard errors were whack, so one of the best
signs is that the standard errors are massive compared to what you'd
expect. It's easiest to see this in a standardized solution, because
in that case the standard errors should be proportional to 1/sqrt(n).
If they are not, that's a sure sign that one or more parameters is
unidentified, either in the population or empirically.

http://www.stata.com/statalist/archive/2012-10/msg00525.html
http://www.stata.com/statalist/archive/2012-10/msg00526.html


Bekker, P., Merckens, A., Wansbeek, T. (1994). Identification,
Equivalent Models and Computer Algebra. Academic Press.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

