 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

From: kmacdonald@stata.com (Kristin MacDonald, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: More re factor loadings
Date: Tue, 01 Oct 2013 08:31:27 -0500

```Dave Garson <garson@ncsu.edu> asks how to obtain the coefficients that
SPSS refers to as "factor score weights" and that SAS labels "latent variable
scores regression coefficients".

First, let me discuss the terminology we use in our documentation.  I
recognize that different groups may call coefficients by different names, so I
want to make sure that there is no confusion.  When we use the term "factor
loading", we are referring to the coefficients on paths from latent variables
to observed variables.  These may be in the standardized or unstandardized
metric.

I believe that Dave would instead like the coefficients that can be used to
create a linear combination of the observed variables corresponding to the
predicted value of the latent variable.  In Stata, we call these "scoring
coefficients" in the '[MV] factor postestimation' manual entry where we
discuss predictions of factors with exploratory factor analysis.

There is not an option to automatically obtain a matrix of regression scoring
coefficients after fitting a model with -sem-.  However, if Dave is interested
in obtaining the predicted factor scores, he can use the -predict, latent-
command.  For example,

webuse sem_1fmm, clear
sem (X -> x1 x2 x3 x4)
predict xpred, latent(X)

This creates a new variable, xpred, containing the predicted value of X.

If Dave is interested in the actual coefficients used in the linear
combination that produces these predictions, he can create them manually using
the matrices returned by -estat framework- after -sem-.  In the case of a
standard CFA model, the coefficients are a function of the -r(Sigma)- matrix.
These coefficients are applied to the observed variables after they have been
centered.  The -r(mu)- matrix contains the mean of each variable, which we can
use to center the observed variables.  The code below demonstrates how to
predict the value of the latent variable X manually, for the above model:

estat framework, fitted
mat mu = r(mu)
mat sigma = r(Sigma)
mat sigma_zz = sigma[1..4,1..4]
mat inv_sigma_zz = syminv(sigma_zz)
mat sigma_zl = sigma[5,1..4]

mat scoef = inv_sigma_zz*sigma_zl'
mat list scoef

forvalues i = 1/4 {
    gen x`i'_cent = x`i' - mu[1,`i']
}

gen mypred = scoef[1,1]*x1_cent + scoef[2,1]*x2_cent + ///
             scoef[3,1]*x3_cent + scoef[4,1]*x4_cent

list xpred mypred in 1/10

The coefficients are stored in the scoef matrix and are then used to predict
the value of X in a new variable called mypred.  These match the values
produced by the -predict- command above, up to rounding in the last displayed
digit.  The output for the full set of commands is given below my signature.

More complicated models containing structural paths not included in a CFA
model will require more matrix calculations that involve the fitted structural
path coefficients.

--Kristin
kmacdonald@stata.com

. use sem_1fmm, clear
(single-factor measurement model)

. sem (X -> x1 x2 x3 x4)

Endogenous variables

Measurement:  x1 x2 x3 x4

Exogenous variables

Latent:       X

Fitting target model:

Iteration 0:   log likelihood = -2081.0258
Iteration 1:   log likelihood =  -2080.986
Iteration 2:   log likelihood = -2080.9859

Structural equation model                       Number of obs      =       123
Estimation method  = ml
Log likelihood     = -2080.9859

 ( 1)  [x1]X = 1
------------------------------------------------------------------------------
             |                 OIM
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Measurement  |
  x1 <-      |
           X |          1  (constrained)
       _cons |   96.28455   1.271963    75.70   0.000     93.79155    98.77755
  -----------+----------------------------------------------------------------
  x2 <-      |
           X |   1.172364   .1231777     9.52   0.000     .9309398    1.413788
       _cons |   97.28455   1.450053    67.09   0.000      94.4425    100.1266
  -----------+----------------------------------------------------------------
  x3 <-      |
           X |   1.034523   .1160558     8.91   0.000     .8070579    1.261988
       _cons |   97.09756   1.356161    71.60   0.000     94.43953    99.75559
  -----------+----------------------------------------------------------------
  x4 <-      |
           X |   6.886044   .6030898    11.42   0.000     5.704009    8.068078
       _cons |   690.9837   6.960137    99.28   0.000     677.3421    704.6254
-------------+----------------------------------------------------------------
    var(e.x1)|   80.79361   11.66414                      60.88206    107.2172
    var(e.x2)|   96.15861   13.93945                      72.37612    127.7559
    var(e.x3)|   99.70874   14.33299                      75.22708    132.1576
    var(e.x4)|   353.4711   236.6847                      95.14548    1313.166
       var(X)|   118.2068   23.82631                      79.62878    175.4747
------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(2)   =      1.78, Prob > chi2 = 0.4111

. predict xpred, latent(X)

.
.
. estat framework, fitted

Endogenous variables on endogenous variables

             | observed
        Beta |        x1         x2         x3         x4
-------------+--------------------------------------------
observed     |
          x1 |         0
          x2 |         0          0
          x3 |         0          0          0
          x4 |         0          0          0          0
----------------------------------------------------------

Exogenous variables on endogenous variables

             | latent
       Gamma |         X
-------------+-----------
observed     |
          x1 |         1
          x2 |  1.172364
          x3 |  1.034523
          x4 |  6.886044
-------------------------

Covariances of error variables

             | observed
         Psi |      e.x1       e.x2       e.x3       e.x4
-------------+--------------------------------------------
observed     |
        e.x1 |  80.79361
        e.x2 |         0   96.15861
        e.x3 |         0          0   99.70874
        e.x4 |         0          0          0   353.4711
----------------------------------------------------------

Intercepts of endogenous variables

             | observed
       alpha |        x1         x2         x3         x4
-------------+--------------------------------------------
       _cons |  96.28455   97.28455   97.09756   690.9837
----------------------------------------------------------

Covariances of exogenous variables

             | latent
         Phi |         X
-------------+-----------
latent       |
           X |  118.2068
-------------------------

Means of exogenous variables

             | latent
       kappa |         X
-------------+-----------
        mean |         0
-------------------------

Fitted covariances of observed and latent variables

             | observed                                   | latent
       Sigma |        x1         x2         x3         x4 |         X
-------------+--------------------------------------------+-----------
observed     |                                            |
          x1 |  199.0004                                  |
          x2 |  138.5813   258.6263                       |
          x3 |  122.2876   143.3656   226.2181            |
          x4 |  813.9769   954.2769   842.0779   5958.551 |
-------------+--------------------------------------------+-----------
latent       |                                            |
           X |  118.2068   138.5813   122.2876   813.9769 |  118.2068
----------------------------------------------------------------------

Fitted means of observed and latent variables

             | observed                                   | latent
          mu |        x1         x2         x3         x4 |         X
-------------+--------------------------------------------+-----------
          mu |  96.28455   97.28455   97.09756   690.9837 |         0
----------------------------------------------------------------------

. mat mu = r(mu)

. mat sigma = r(Sigma)

. mat sigma_zz = sigma[1..4,1..4]

. mat inv_sigma_zz = syminv(sigma_zz)

. mat sigma_zl = sigma[5,1..4]

.
. mat scoef = inv_sigma_zz*sigma_zl'

. mat list scoef

scoef[4,1]
                latent:
                     X
observed:x1  .06875754
observed:x2  .06772851
observed:x3  .05763739
observed:x4  .10822142

.
. forvalues i = 1/4 {
  2.   gen x`i'_cent = x`i' - mu[1,`i']
  3. }

.
. gen mypred = scoef[1,1]*x1_cent + scoef[2,1]*x2_cent + ///
>              scoef[3,1]*x3_cent + scoef[4,1]*x4_cent

. list xpred mypred in 1/10

     +-----------------------+
     |     xpred      mypred |
     |-----------------------|
  1. | -26.55233   -26.55233 |
  2. |  11.92044    11.92044 |
  3. |  8.319204    8.319203 |
  4. |  -7.50836    -7.50836 |
  5. |  -3.87875   -3.878749 |
     |-----------------------|
  6. |  .9258427    .9258427 |
  7. | -4.445202   -4.445201 |
  8. |  3.599469    3.599469 |
  9. | -4.307086   -4.307086 |
 10. |  6.506975    6.506975 |
     +-----------------------+

```
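
The manual calculation in the message above can be read as the usual regression-scoring formula for a latent variable. What follows is a sketch of that reading, not something stated in the original post. The symbols z_i (the row of observed x1-x4 for observation i), b, Lambda, and phi are notation assumed here for illustration: Sigma_zz and Sigma_zX stand for the observed block and the observed-by-latent block of -r(Sigma)-, mu_z for the observed part of -r(mu)-, and Lambda, Psi, phi, kappa for the Gamma, Psi, Phi, and kappa matrices shown by -estat framework-.

```latex
\begin{align*}
b &= \Sigma_{zz}^{-1}\,\Sigma_{zX}
  && \text{(the 4 x 1 matrix scoef)} \\
\hat{X}_i &= \kappa + b^{\top}\,(z_i - \mu_z),
  \qquad \kappa = 0 \text{ in this model}
  && \text{(the variable mypred)} \\
\Sigma_{zz} &= \phi\,\Lambda\Lambda^{\top} + \Psi,
  \qquad \Sigma_{zX} = \phi\,\Lambda
  && \text{(one-factor CFA with a single latent } X\text{)}
\end{align*}
```

As a check against the listed output, the first diagonal element of Sigma_zz is 118.2068 * 1^2 + 80.79361 = 199.0004, and the second element of Sigma_zX is 118.2068 * 1.172364 = 138.5813, both of which agree with the fitted Sigma matrix.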