
RE: st: RE: Aren't distinct factors from factor analysis or PCA orthogonal to each other?

From   kokootchke <>
To   statalist <>
Subject   RE: st: RE: Aren't distinct factors from factor analysis or PCA orthogonal to each other?
Date   Mon, 17 Aug 2009 17:15:33 -0400

Thank you to Cameron, Bob and everybody else for the references.

I have a response to Jay and a couple more questions for everybody, if you can still help me...

Jay wrote:
> Before you go any further I think you have a big problem to consider: 100 variables on, say 200 countries means you have WAY more covariances (or correlations) than you have countries. This means your correlation matrix is singular.

I don't think I have that problem because I don't have 200 countries. I only have a little over 30 countries.

However, even if I had 200 countries, I don't understand exactly what the problem would be, because the observations for country i and country j are stacked on top of one another, with all 100 variables as columns. So, I have:

country    year  GDP   inflation reserves
Argentina  1990  2.3   6.4       100
Argentina  1991  2.8   7.4       250
Argentina  1992  2.6   7.0       200
Argentina  2006  3.2   8.0       400
Brazil     1990  1.7   5.4       120
Brazil     1991  2.1   6.3       140
Brazil     1992  2.5   7.0       180

So the variables I enter into my factor analysis are GDP, inflation, and reserves, and the -factor- command in Stata knows nothing about the panel/time-series structure of my data. I can see why it might be relevant to account for the underlying panel structure -- for instance, the jump in GDP, inflation, reserves, and any other variables between Argentina in 2006 and Brazil in 1990 may be a bit strange to deal with.
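
On Jay's singularity point, a minimal sketch (in Python rather than Stata) may help show why the number of stacked rows matters. It takes the seven toy rows from the table above, builds the 3x3 correlation matrix by hand, and compares its determinant when only two rows are used (fewer observations than variables) with the determinant when all seven country-year rows are used. The helper names corr and corr_det are made up for this illustration, and the numbers are the toy values above, not real data.

```python
from math import sqrt

# Toy rows from the table above: (GDP, inflation, reserves) per country-year.
rows = [
    (2.3, 6.4, 100.0),  # Argentina 1990
    (2.8, 7.4, 250.0),  # Argentina 1991
    (2.6, 7.0, 200.0),  # Argentina 1992
    (3.2, 8.0, 400.0),  # Argentina 2006
    (1.7, 5.4, 120.0),  # Brazil 1990
    (2.1, 6.3, 140.0),  # Brazil 1991
    (2.5, 7.0, 180.0),  # Brazil 1992
]

def corr(x, y):
    """Pearson correlation of two equal-length lists (illustrative helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def corr_det(data):
    """Determinant of the 3x3 correlation matrix of the three columns."""
    g, i, r = (list(col) for col in zip(*data))
    r_gi, r_gr, r_ir = corr(g, i), corr(g, r), corr(i, r)
    return 1 + 2 * r_gi * r_gr * r_ir - r_gi**2 - r_gr**2 - r_ir**2

# Two observations, three variables: the matrix is singular (determinant 0 up to rounding).
det_two_rows = corr_det(rows[:2])

# Seven stacked country-year observations, three variables: nonzero for these toy numbers.
det_all_rows = corr_det(rows)

print(det_two_rows)
print(det_all_rows)
```

The point of the comparison is only that singularity depends on how many rows enter the analysis relative to the number of variables; it says nothing about whether stacking country-years is appropriate.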

So, the first question is: do I need to take this panel structure into account? And if so, how?

The other question is: do units matter? For instance, I know that factor analysis and PCA are both based on a variance-covariance matrix... but if I have two variables, x and y, the covariance between them will be different from the covariance between, say, 2x and y:

cov(x,y) <> cov(2x,y)

So what would happen if I expressed my GDP in dollars for all countries rather than in local-currency units? Or in millions rather than billions?
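
The scaling question can be checked numerically with a minimal sketch, written in Python rather than Stata and using the Argentina GDP and inflation values from the toy table as x and y. It shows that doubling x doubles the covariance but leaves the correlation unchanged, so whether units matter in practice depends on whether the analysis works from the covariance matrix or the correlation matrix. The helpers cov and corr are defined here for illustration only.

```python
from math import sqrt

def cov(x, y):
    """Sample covariance of two equal-length lists (illustrative helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def corr(x, y):
    """Pearson correlation: covariance rescaled by both standard deviations."""
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))

x = [2.3, 2.8, 2.6, 3.2]   # toy GDP values (Argentina rows)
y = [6.4, 7.4, 7.0, 8.0]   # toy inflation values (Argentina rows)
x2 = [2 * v for v in x]    # the same variable in different units

# Covariance scales with the units: cov(2x, y) = 2 * cov(x, y).
print(cov(x, y), cov(x2, y))

# Correlation is unit-free: corr(2x, y) = corr(x, y).
print(corr(x, y), corr(x2, y))
```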

Thank you once again.

