From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Bootstrapping factor loadings
Date: Tue, 28 Feb 2012 12:15:58 +0000

```The underlying point is that there is an arbitrariness of sign in
factor analysis results, as your linear algebra text may or may not
explain. In the example below, the single retained factor
flips sign depending on the order of variable input. That's not
bootstrap sampling variation, but it's also true that in a large
enough bootstrap sample, some of the loadings will differ in sign from
the majority vote. This is one of many bootstrap problems in which it
is salutary to look at the sampling distribution, not just confidence
intervals.

Nick

. sysuse auto
(1978 Automobile Data)

. factor weight mpg
(obs=74)

Factor analysis/correlation                        Number of obs    =       74
    Method: principal factors                      Retained factors =        1
    Rotation: (unrotated)                          Number of params =        1

    --------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+------------------------------------------------------------
        Factor1  |      1.45871      1.61435            1.1194       1.1194
        Factor2  |     -0.15564            .           -0.1194       1.0000
    --------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(1)  =   76.43 Prob>chi2 = 0.0000

    ---------------------------------------
        Variable |  Factor1 |   Uniqueness
    -------------+----------+--------------
          weight |  -0.8540 |      0.2706
             mpg |   0.8540 |      0.2706
    ---------------------------------------

. factor mpg weight
(obs=74)

Factor analysis/correlation                        Number of obs    =       74
    Method: principal factors                      Retained factors =        1
    Rotation: (unrotated)                          Number of params =        1

    --------------------------------------------------------------------------
         Factor  |   Eigenvalue   Difference        Proportion   Cumulative
    -------------+------------------------------------------------------------
        Factor1  |      1.45871      1.61435            1.1194       1.1194
        Factor2  |     -0.15564            .           -0.1194       1.0000
    --------------------------------------------------------------------------
    LR test: independent vs. saturated:  chi2(1)  =   76.43 Prob>chi2 = 0.0000

    ---------------------------------------
        Variable |  Factor1 |   Uniqueness
    -------------+----------+--------------
             mpg |  -0.8540 |      0.2706
          weight |   0.8540 |      0.2706
    ---------------------------------------

On Tue, Feb 28, 2012 at 12:06 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> I'd add a note of caution here. Factors can get flipped around as part
> of sampling variation, which will change the sign of the loadings and
> -- especially when loadings are large and more interesting -- inflate
> the bootstrap error.
>
> We could debate whether this is part of the problem or it makes more
>
> Nick
>
> On Tue, Feb 28, 2012 at 10:15 AM, Grant, Robert
> <Robert.Grant@sgul.kingston.ac.uk> wrote:
>> Following an earlier thread (http://www.stata.com/statalist/archive/2012-02/msg00036.html), a fellow Statalister asked me off-list about extending this to more than one factor. This is pretty easy to do once you have got the idea of the requirements of -bstat- but I include my suggested code here in case it is of use to anyone in the future:
>>
>> If you have more than one factor, the e(r_L) matrix will have more than one column, one for each factor. If you are using -pca- instead, the same loadings matrix will be called e(b). You need to rearrange them into a single-column vector which here I call obs, and that contains point estimates which -bstat- will then access. If you are not interested in inference for extra stuff such as the % variance explained, then it is simple:
>>
>> // example begins -------------------------------------------
>> // first, get the observed point estimates:
>> factor var1 var2 var3 ... var26, pcf factors(4) // here there are 4 factors and 26 variables
>> rotate, promax // I hope this makes sense - I "don't do" oblique rotations
>> forvalues i=1/4 {
>> }
>>
>> // then carry on with the program...
>> // example ends --------------------------------------------
>>
>> Or if you need extra stuff, have a loop for columns within each loop for rows:
>>
>> // example begins -------------------------------------------
>> // first, get the observed point estimates:
>> factor var1 var2 var3 ... var26, pcf factors(4) // here there are 4 factors and 26 variables
>> rotate, promax // I hope this makes sense - I "don't do" oblique rotations
>> forvalues i=1/26 {
>>        forvalues j=1/4 {
>>        }
>> }
>> // I was interested in % variance explained - you might want to add other stats in.
>> scalar varexpl=e(rho)
>> // now put it back together:
>>                                .
>>                                .
>>                                .
>>                                .
>>            varexpl)
>> /* then carry on with the program...
>> but be very careful to cite the individual loadings and stats within -simulate- in exactly the same order as above; here I have gone across rows then down columns which looks nicer as j<i but is slightly unconventional in loadings I suppose */
>>
>> // and here comes the program...
>> capture: program drop myboot
>> program define myboot, rclass
>>        preserve
>>        bsample
>>        factor var1 var2 var3 ... var26, pcf factors(4) // same factors() as used for the observed estimates
>>        rotate, promax
>>       forvalues i=1/26 {
>>        }
>>        scalar bootexp=e(rho)
>>      restore
>> end
>>
>> // now you use -simulate- to run the -myboot- program, creating one resample each time.
>>        myboot
>> bstat, stat(obs) n(999) // put the original number of observations into n()
>> estat bootstrap, all
>>
>> // example ends --------------------------------------------

```
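The advice in the reply above, to look at the bootstrap sampling distribution of a loading and not only at its confidence interval, can be sketched in Stata roughly as follows. This is a minimal illustration, not code from the thread: the program name `bootload`, the returned scalar `l_mpg`, the seed, and the choice of 200 replications are all assumptions made for the example, and it reuses the two-variable `auto` example with unrotated loadings taken from `e(L)`. Whether any sign flips actually show up will depend on the data and the model.

```stata
* Sketch only: bootload, l_mpg, the seed and reps(200) are illustrative choices.
clear all
set seed 12345                       // arbitrary seed, for reproducibility only

capture program drop bootload
program define bootload, rclass
    // draw one bootstrap resample, refit the factor model,
    // and return the unrotated loading of mpg on the first factor
    tempname L
    preserve
    bsample
    quietly factor weight mpg, factors(1)
    matrix `L' = e(L)                // rows follow the varlist: weight, then mpg
    return scalar l_mpg = `L'[2,1]
    restore
end

sysuse auto, clear
simulate l_mpg = r(l_mpg), reps(200) nodots: bootload

* Inspect the whole distribution, not just an interval
summarize l_mpg, detail
count if l_mpg < 0                   // resamples where the loading is negative
histogram l_mpg
```

If the histogram turns out to be bimodal, with one cluster near the observed loading and a mirror-image cluster of opposite sign, that is the sign arbitrariness discussed above and not ordinary sampling variation, and a standard error computed across both clusters would be inflated accordingly.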