The content below is applicable for Stata 8.

## Factor analysis

- Principal-components factor
- Principal factor
- Iterated principal factor
- ML factors
- Varimax rotation with and without Horst standardization
- Promax rotation with and without Horst standardization
- Bartlett scoring
- Regression scoring

Stata’s factor command allows you to fit common-factor models; see also principal components.

By default, factor produces estimates using the principal-factor method (communalities set to the squared multiple-correlation coefficients). Alternatively, factor can produce iterated principal-factor estimates (communalities re-estimated iteratively), principal-components factor estimates (communalities set to 1), or maximum-likelihood factor estimates.
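As a rough sketch, the four estimation methods described above correspond to the calls below. The variable list joints-throat is borrowed from the example later on this page and is only an assumption here; pf, ipf, pcf, and ml are the Stata 8 option names for the four methods.

```stata
* Sketch only: assumes the symptom variables joints through throat are in memory.

* Principal factor (the default; communalities set to squared multiple correlations)
factor joints-throat, pf

* Iterated principal factor (communalities re-estimated iteratively)
factor joints-throat, ipf

* Principal-components factor (communalities set to 1)
factor joints-throat, pcf

* Maximum-likelihood factor
factor joints-throat, ml
```

Because pf is the default, typing factor joints-throat with no options should give the same result as the first call.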

After you fit a factor model, Stata allows you to rotate the factor-loading matrix using the varimax (orthogonal) and promax (oblique) methods. Stata can score a set of factor estimates using either rotated or unrotated loadings. Both regression and Bartlett scorings are available.

Below we fit a maximum-likelihood factor model on eight medical symptoms from a medical outcomes study (Tarlov et al. 1989) using three factors:

 	. factor joints-throat, ml factors(3) protect(5)

(obs=3046)
Likelihood verification 0, maximum =  -21.8257
Likelihood verification 1, maximum =  -21.8257
Likelihood verification 2, maximum =  -21.8257
Likelihood verification 3, maximum =  -18.4300
Likelihood verification 4, maximum =  -21.8257
Likelihood verification 5, maximum =  -18.4300

Differing maxima obtained.

Iteration 0:  log likelihood =-1925.2187
Iteration 1:  log likelihood =-40.623068
Iteration 2:  log likelihood = -27.38831
Iteration 3:  log likelihood =-26.291917
Iteration 4:  log likelihood = -18.49983
Iteration 5:  log likelihood = -18.43281
Iteration 6:  log likelihood =-18.430164
Iteration 7:  log likelihood =-18.429999
Iteration 8:  log likelihood =-18.429988
Iteration 9:  log likelihood =-18.429988
Iteration 10:  log likelihood =-18.429988

(maximum likelihood factors; 3 factors retained)
Factor     Variance       Difference    Proportion    Cumulative
------------------------------------------------------------------
1        2.36049         1.64310      0.6892         0.6892
2        0.71739         0.37019      0.2095         0.8986
3        0.34720               .      0.1014         1.0000

Test:  3 vs. no   factors.  Chi2(  24) = 4718.59, Prob > chi2 =  0.0000
Test:  3 vs. more factors.  Chi2(   7) =   36.79, Prob > chi2 =  0.0000

 Variable |      1          2          3    Uniqueness
----------+-------------------------------------------
   joints |   0.62749   -0.07856    0.26240    0.53124
    cough |   0.29859    0.14908    0.05009    0.88611
 backache |   0.82633   -0.33130   -0.11018    0.19530
   nausea |   0.49540    0.49656   -0.25307    0.44396
 indigest |   0.46711    0.39728   -0.06671    0.61953
  hvyfeel |   0.57369    0.21220    0.42173    0.44798
 headache |   0.50816    0.25731   -0.12097    0.66092
   throat |   0.37922    0.25219    0.05205    0.78988


To obtain these results, we typed

	factor joints-throat, ml factors(3) protect(5)


Most Stata commands share the same syntax: the command name is followed by a list of variables and then, optionally, a comma and any options. Here the variable list joints-throat refers to every variable stored from joints through throat in the dataset; factor analysis has no dependent variable, so all eight symptoms are treated alike. We specified factor's ml option, producing estimates by maximum likelihood. We also typed factors(3) to indicate that we wanted to keep the first three factors.

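This is an interesting problem because the likelihood has two distinct local maxima: the verification runs shown in the output converge to either -21.8257 or -18.4300. The protect() option guards against reporting a merely local maximum by repeating the optimization from different starting points and comparing the solutions found. protect(5) indicated that this search was to be performed five times; the reported estimates correspond to the larger log likelihood, -18.43.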

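We find that most of the explained variance can be attributed to the first factor, which accounts for a proportion of 0.6892; the second and third factors account for 0.2095 and 0.1014. Stata also shows the uniqueness of each variable, that is, the share of its variance not explained by the retained factors. Cough (0.88611) and throat (0.78988) are the least well explained.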
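The figures in the output can be checked by hand with display. The sketch below uses only numbers copied from the tables above. It assumes the usual relationship for an orthogonal factor solution on a correlation matrix: a variable's uniqueness equals 1 minus the sum of its squared loadings.

```stata
* Proportion of explained variance for factor 1:
* its variance divided by the sum of the three factor variances.
display 2.36049 / (2.36049 + 0.71739 + 0.34720)

* Uniqueness of joints: 1 minus the sum of its squared loadings.
* The negative loading is parenthesized so that it is squared as a whole.
display 1 - (0.62749^2 + (-0.07856)^2 + 0.26240^2)

* The same check for backache.
display 1 - (0.82633^2 + (-0.33130)^2 + (-0.11018)^2)
```

The first result should be about 0.6892. The other two should be about 0.5312 and 0.1953, agreeing with the Uniqueness column up to rounding in the displayed loadings.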

The researcher actually fitting this model interpreted the first factor as a measure of the general level of sickness and the second factor as a contrast between musculoskeletal problems and other types of problems. To rotate the factor loadings in search of different interpretations, the researcher could now type rotate to examine an orthogonal varimax rotation; rotate, promax to examine an oblique promax rotation; or, for instance, rotate, promax(4) to examine a promax rotation with promax power 4 (producing simpler loadings but at the cost of more correlation between factors).
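The rotation commands just mentioned could be collected in a do-file as sketched below. The sketch assumes the symptom data are in memory. The last call, which uses the horst option, is an assumption based on the Horst standardization listed among the features at the top of this page; check help rotate before relying on it.

```stata
* Sketch only: fit the model first, because rotate works on the most recent factor results.
factor joints-throat, ml factors(3) protect(5)

* Orthogonal varimax rotation (the default)
rotate

* Oblique promax rotation with the default promax power
rotate, promax

* Promax rotation with promax power 4
rotate, promax(4)

* Assumption: varimax rotation with Horst standardization
rotate, varimax horst
```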

Stata’s score command produces estimates of the factors after factor or rotate:

 	. score f1
(based on unrotated factors)
(2 scorings not used)

Scoring Coefficients
 Variable |      1
----------+----------
   joints |   0.15644
    cough |   0.04463
 backache |   0.56038
   nausea |   0.14779
 indigest |   0.09986
  hvyfeel |   0.16960
   throat |   0.06359


Typing score f1 produced estimates of the first factor. Typing score f1 f2 would produce estimates of the first two factors, and typing score f1 f2 f3 (or score f1-f3) would produce estimates of the first three factors. The names f1, f2, etc., are arbitrary; the score command allows you to create new variables that could then be used in analysis.
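A minimal sketch of scoring all three factors by both available methods follows. The variable names r1-r3 and b1-b3 are hypothetical, chosen so that they do not collide with the f1 variable created above. The bartlett option name is assumed from the Stata 8 score syntax, in which regression scoring is the default.

```stata
* Sketch only: run after factor (or rotate); the new variable names are arbitrary.

* Regression scoring (the default) for all three factors
score r1 r2 r3

* Bartlett scoring for the same three factors
score b1 b2 b3, bartlett

* Compare the two sets of estimates for the first factor
correlate r1 b1
```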

Stata also has a command for Cronbach’s alpha, providing a simpler way of combining the eight symptoms, assuming that all have equal weight:

 	. alpha joints-throat, generate(symplev)

Scale = sum(unstandardized variables)

Average interitem covariance:     .3783125
Number of items in the scale:            8
Scale reliability coefficient:      0.7591

. summarize f1 symplev

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      f1 |    3046    4.86e-10   .9314048  -1.254182     3.1028
 symplev |    3320    2.021112   .7290644          1          5

. correlate f1 symplev
(obs=3046)

        |       f1  symplev
--------+------------------
      f1|   1.0000
 symplev|   0.9343   1.0000


It turns out that the scale created by alpha and the first factor score estimate are highly correlated: the correlation is 0.9343 across the 3,046 observations for which both are available.


### References

Tarlov, A. R., J. E. Ware, Jr., S. Greenfield, E. C. Nelson, E. Perrin, and M. Zubkoff. 1989. The medical outcomes study. Journal of the American Medical Association 262: 925–930.