

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: wealth score using principal component analysis (PCA)
Date: Tue, 25 Sep 2012 01:07:27 +0100

You seem to be misunderstanding both PCA and the syntax of -predict- after -pca-.

To take the second first, -predict- just gives you as many components as you ask for. Ask for one by giving one variable name and you get scores for the first PC, regardless of what name you give. Stata is indifferent to what name you give (so long as it is new and legal) and indeed

    predict p3
    predict p777

would give you further identical copies of the first PC.

    predict P1 P2

would give you scores for the first two PCs.

As for PCA, there are potentially as many PCs as variables: although the -components()- option puts a self-defined limit on how many you can calculate, the main purpose of this option appears to be to let -pca- behave more like -factor-.

Even if your purpose is to use just one PC, it usually makes sense to look at several and at the relationships of those PCs to your original variables. Sometimes the second, third, ... PCs pick up important parts of the variation, and it is a good idea to look at those too to see what the first PC is missing.

In the case of wealth variables it might be a good idea to think about using PCA on logarithmic transformations of the variables too (assuming all values are strictly positive).

Note that the audience of Statalist is very international and interdisciplinary, so that assuming that "DHS" is self-evident is likely to be wrong in many cases.

Your last question (c) is unanswerable. Many people do it, but how far it is "OK" in your project depends on your goals and your data, which we can't see.

Nick

On Mon, Sep 24, 2012 at 9:20 PM, Shikha Sinha <shikha.sinha414@gmail.com> wrote:
> I am trying to create a wealth score using the ownership of different
> assets in the DHS survey. I am using -pca- but I am not sure how to
> estimate the score as I want to use the wealth score as one of the
> independent variables.
>
> pca x1-x4
> predict p1,score
>
> but -predict- only generates score from first component.
>
> I also tried the following,
>
> -pca x1-x4, components (2)
> predict p2, score
>
> However, p1 and p2 are same.
>
> My questions are,
> (a) why there is no difference between p1 and p2?
> (b) How can I generate score by using first 2 components only?
> (c) Is it ok to use continuous pca score as an independent variable?
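
The -predict- behaviour described in the reply can be checked on one of the datasets shipped with Stata. What follows is only a sketch under stated assumptions: auto.dta and four of its variables stand in for x1-x4 (they are not DHS asset variables), and the names first, copy777, P1, P2, Q1, Q2 and the ln_ prefix are illustrative, not anything from the original thread.

```stata
* Sketch only: auto.dta is a stand-in for the poster's data.
sysuse auto, clear

pca price mpg weight length

* One new variable name requested, so one set of scores is returned:
* the first PC, whatever the variable happens to be called.
predict first, score
predict copy777, score
compare first copy777          // reports how the two variables differ, if at all

* Two new variable names requested, so scores for the first two PCs.
predict P1 P2, score
correlate P1 P2

* Look at how the PCs relate to the original variables
* before settling on the first PC alone.
estat loadings
screeplot

* Limiting the number of retained components with components().
pca price mpg weight length, components(2)
predict Q1 Q2, score

* PCA on logarithms, which only makes sense for strictly positive values.
foreach v of varlist price mpg weight length {
    generate ln_`v' = ln(`v')
}
pca ln_price ln_mpg ln_weight ln_length
```

The number of score variables produced is governed by the number of names given to -predict-, not by the names themselves. Asset-ownership indicators coded 0/1 cannot be logged, so the last step applies only where every value is strictly positive, as the reply notes.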

**Follow-Ups**: **Re: st: wealth score using principal component analysis (PCA)** *From:* Stas Kolenikov <skolenik@gmail.com>

**References**: **st: wealth score using principal component analysis (PCA)** *From:* Shikha Sinha <shikha.sinha414@gmail.com>
