Stata’s factor command allows you to fit common-factor models; see also principal components.
By default, factor produces estimates using the principal-factor method (communalities set to the squared multiple-correlation coefficients). Alternatively, factor can produce iterated principal-factor estimates (communalities re-estimated iteratively), principal-components factor estimates (communalities set to 1), or maximum-likelihood factor estimates.
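The four estimation methods map onto options of factor. The following is a minimal sketch, assuming the eight symptom variables joints through throat are already in memory and that the option names pf, ipf, pcf, and ml are spelled as in recent Stata releases:

```stata
* Assumes the symptom variables joints through throat are in memory.
* Principal-factor method (the default)
factor joints-throat, pf
* Iterated principal-factor method
factor joints-throat, ipf
* Principal-components factor method
factor joints-throat, pcf
* Maximum-likelihood factor method
factor joints-throat, ml
```

Each call replaces the previous factor results, so only the last fit is available to later rotation or scoring commands.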
After you fit a factor model, Stata allows you to rotate the factor-loading matrix using the varimax (orthogonal) and promax (oblique) methods. Stata can score a set of factor estimates using either rotated or unrotated loadings. Both regression and Bartlett scorings are available.
Below we fit a maximum-likelihood factor model on eight medical symptoms from a medical outcomes study (Tarlov et al. 1989) using three factors:
. factor joints-throat, ml factors(3) protect(5)
(obs=3046)
Likelihood verification 0, maximum = -21.8257
Likelihood verification 1, maximum = -21.8257
Likelihood verification 2, maximum = -21.8257
Likelihood verification 3, maximum = -18.4300
Likelihood verification 4, maximum = -21.8257
Likelihood verification 5, maximum = -18.4300
Differing maxima obtained.
Iteration 0: log likelihood =-1925.2187
Iteration 1: log likelihood =-40.623068
Iteration 2: log likelihood = -27.38831
Iteration 3: log likelihood =-26.291917
Iteration 4: log likelihood = -18.49983
Iteration 5: log likelihood = -18.43281
Iteration 6: log likelihood =-18.430164
Iteration 7: log likelihood =-18.429999
Iteration 8: log likelihood =-18.429988
Iteration 9: log likelihood =-18.429988
Iteration 10: log likelihood =-18.429988
(maximum likelihood factors; 3 factors retained)
  Factor    Variance   Difference   Proportion   Cumulative
------------------------------------------------------------------
     1       2.36049      1.64310       0.6892       0.6892
     2       0.71739      0.37019       0.2095       0.8986
     3       0.34720            .       0.1014       1.0000
Test: 3 vs. no factors. Chi2( 24) = 4718.59, Prob > chi2 = 0.0000
Test: 3 vs. more factors. Chi2( 7) = 36.79, Prob > chi2 = 0.0000
               Factor Loadings
 Variable |      1          2          3    Uniqueness
----------+-------------------------------------------
   joints |   0.62749   -0.07856    0.26240    0.53124
    cough |   0.29859    0.14908    0.05009    0.88611
 backache |   0.82633   -0.33130   -0.11018    0.19530
   nausea |   0.49540    0.49656   -0.25307    0.44396
 indigest |   0.46711    0.39728   -0.06671    0.61953
  hvyfeel |   0.57369    0.21220    0.42173    0.44798
 headache |   0.50816    0.25731   -0.12097    0.66092
   throat |   0.37922    0.25219    0.05205    0.78988
To obtain these results, we typed
factor joints-throat, ml factors(3) protect(5)
Stata commands share a common syntax: the command name is followed by a list of variables (for estimation commands, the dependent variable and then the independent variables) and then, optionally, a comma and any options. factor has no dependent variable; joints-throat is simply the list of variables to analyze, here every variable stored from joints through throat in the dataset. We specified factor's ml option to produce maximum-likelihood estimates and factors(3) to keep the first three factors.
This is an interesting problem because the likelihood has two distinct local maxima, -21.8257 and -18.4300, as the verification lines in the output show. Stata's protect() option guards against stopping at a merely local maximum by restarting the search from different starting points. protect(5) requested five such searches, and the reported solution is the better of the two maxima, with a log likelihood of -18.43.
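When the verification lines disagree, as they do here, it can be worth asking for more searches and fixing the random-number seed so the run is repeatable. The sketch below assumes that factor accepts a seed() option together with protect(), as we understand recent releases to do; the values 20 and 12345 are arbitrary choices for illustration only:

```stata
* Assumes the symptom variables joints through throat are in memory.
* protect(20): repeat the maximization from 20 starting points
* seed(12345): arbitrary seed (assumed option) so the starts are reproducible
factor joints-throat, ml factors(3) protect(20) seed(12345)
```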
We find that most of the explained variance, about 69%, can be attributed to the first factor. Stata also shows each variable's uniqueness, the share of its variance not explained by the retained factors.
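Both quantities can be checked by hand from the output above. The Proportion column divides each factor's variance by the total over the three retained factors. For the unrotated (orthogonal) solution, a variable's uniqueness is one minus the sum of its squared loadings. Using the first factor and the joints row:

```latex
\[
\text{Proportion}_1
  = \frac{2.36049}{2.36049 + 0.71739 + 0.34720}
  = \frac{2.36049}{3.42508}
  \approx 0.6892
\]
\[
\text{Uniqueness}_{\text{joints}}
  = 1 - \left( 0.62749^2 + (-0.07856)^2 + 0.26240^2 \right)
  \approx 1 - 0.4688
  = 0.5312
\]
```

The small difference from the reported 0.53124 comes from rounding in the displayed loadings.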
The researcher actually fitting this model interpreted the first factor as a measure of the general level of sickness and the second factor as a difference between musculoskeletal problems and other types of problems. If he had wanted to rotate the factor loadings to search for different interpretations, he could now type rotate to examine an orthogonal varimax rotation; rotate, promax to examine an oblique promax rotation; or, for instance, rotate, promax(4) to examine a promax rotation with promax power 4 (producing simpler loadings but at a cost of more correlation between factors).
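The three rotations mentioned above can be sketched as follows. This assumes the factor results from the fit above are still the active estimates. The final command, estat common, is our suggestion rather than part of the original analysis; in recent releases it displays the correlations among the common factors, which is where the cost of an oblique rotation shows up:

```stata
* Run after the factor command; each rotate replaces the previous rotation.
* Orthogonal varimax rotation (the default)
rotate
* Oblique promax rotation with the default promax power
rotate, promax
* Oblique promax rotation with promax power 4
rotate, promax(4)
* Correlation matrix of the rotated common factors (recent releases)
estat common
```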
Stata’s score command produces estimates of the factors after factor or rotate:
. score f1
(based on unrotated factors)
(2 scorings not used)
     Scoring Coefficients
 Variable |      1
----------+----------
   joints |   0.15644
    cough |   0.04463
 backache |   0.56038
   nausea |   0.14779
 indigest |   0.09986
  hvyfeel |   0.16960
 headache |   0.10183
   throat |   0.06359
Typing score f1 produced estimates of the first factor. Typing score f1 f2 would produce estimates of the first two factors, and typing score f1 f2 f3 (or score f1-f3) would produce estimates of the first three factors. The names f1, f2, etc., are arbitrary; the score command allows you to create new variables that could then be used in analysis.
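The score syntax shown here is from older releases. To our knowledge, recent Stata releases obtain factor scores with predict after factor or rotate instead. A minimal sketch under that assumption, with arbitrary new variable names that must not already exist in the dataset:

```stata
* Run after factor (and, optionally, rotate).
* Regression-scored estimates of the three factors (the default method)
predict f1 f2 f3
* Bartlett-scored estimate of the first factor
predict b1, bartlett
```

As with score, the new variables can then be used like any others in later analysis.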
Stata also has a command for Cronbach’s alpha, providing a simpler way of combining the eight symptoms, assuming that all have equal weight:
. alpha joints-throat, generate(symplev)
Scale = sum(unstandardized variables)
Average interitem covariance:   .3783125
Number of items in the scale:          8
Scale reliability coefficient:    0.7591
. summarize f1 symplev
Variable |     Obs        Mean   Std. Dev.        Min        Max
---------+-----------------------------------------------------
      f1 |    3046    4.86e-10    .9314048  -1.254182     3.1028
 symplev |    3320    2.021112    .7290644          1          5
. correlate f1 symplev
(obs=3046)
        |       f1  symplev
--------+------------------
      f1|   1.0000
 symplev|   0.9343   1.0000
It turns out that the scale created by alpha and the first factor score estimate are highly correlated (r = 0.93).
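For reference, the reliability coefficient reported by alpha for unstandardized items follows the standard Cronbach's alpha formula. Here k is the number of items, \bar{c} is the average interitem covariance, and \bar{v} is the average item variance. These symbols are our notation, not Stata's:

```latex
\[
\alpha = \frac{k\,\bar{c}}{\bar{v} + (k - 1)\,\bar{c}}
\]
```

In the output above, k = 8 and \bar{c} = 0.3783125. The average item variance \bar{v} is not displayed, so the reported 0.7591 cannot be reproduced from the printed values alone.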