

From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: wealth score using principal component analysis (PCA)
Date: Tue, 25 Sep 2012 11:05:04 -0500

Regarding (c), you would be best off with a structural equation model (-sem-), forgoing the PCA altogether.

--
-- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
-- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the position of my employer

On Mon, Sep 24, 2012 at 7:07 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> You seem to be misunderstanding both PCA and the syntax of -predict-
> after -pca-.
>
> To take the second first, -predict- just gives you as many components
> as you ask for. Ask for one by giving one variable name and you get
> scores for the first PC, regardless of what name you give. Stata's
> indifferent to what name you give (so long as it is new and legal) and
> indeed
>
>     predict p3
>     predict p777
>
> would give you further identical copies of the first PC.
>
>     predict P1 P2
>
> would give you scores for the first two PCs.
>
> As for PCA, there are potentially as many PCs as variables: although
> the -components()- option puts a self-defined limit on how many you
> can calculate, the main purpose of this option appears to be to let
> -pca- behave more like -factor-.
>
> Even if your purpose is to use just one PC, it usually makes sense to
> look at several and the relationships of those PCs to your original
> variables. Sometimes the second, third, ... PCs pick up important parts
> of the variation and it is a good idea to look at those too to see
> what the first PC is missing. In the case of wealth variables it might
> be a good idea to think about using PCA on logarithmic transformations
> of the variables too (assuming all values are strictly positive).
>
> Note that the audience of Statalist is very international and
> interdisciplinary, so that assuming that "DHS" is self-evident is
> likely to be wrong in many cases.
>
> Your last question (c) is unanswerable. Many people do it, but how far
> it is "OK" in your project depends on your goals and your data, which
> we can't see.
>
> Nick
>
> On Mon, Sep 24, 2012 at 9:20 PM, Shikha Sinha <shikha.sinha414@gmail.com> wrote:
>
>> I am trying to create a wealth score using the ownership of different
>> assets in the DHS survey. I am using -pca- but I am not sure how to
>> estimate the score, as I want to use the wealth score as one of the
>> independent variables.
>>
>>     pca x1-x4
>>     predict p1,score
>>
>> but -predict- only generates the score from the first component.
>>
>> I also tried the following,
>>
>>     pca x1-x4, components (2)
>>     predict p2, score
>>
>> However, p1 and p2 are the same.
>>
>> My questions are, (a) why is there no difference between p1 and p2?
>> (b) How can I generate a score by using the first 2 components only?
>> (c) Is it ok to use a continuous pca score as an independent variable?
>>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
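The behaviour of -predict- after -pca- described in the thread can be sketched as follows. This is only an illustration under stated assumptions: it uses the auto dataset shipped with Stata in place of the DHS asset variables, and the variable names p1, p777, pc1 and pc2 are arbitrary choices, not anything from the original posts.

```stata
* Illustration only: auto.dta and its variables stand in for the asset data
sysuse auto, clear

pca weight length headroom trunk

* One new variable name: scores for the first PC, whatever the name is
predict p1, score
predict p777, score
assert reldif(p1, p777) < 1e-6

* Two new variable names: scores for the first two PCs
predict pc1 pc2, score
assert reldif(pc1, p1) < 1e-6

* The scores are ordinary variables and can enter a model as regressors
regress price pc1 pc2
```

Whether using such scores as regressors is sensible still depends on the goals and the data, as both replies point out.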

**Follow-Ups**:
- **Re: st: wealth score using principal component analysis (PCA)** *From:* Shikha Sinha <shikha.sinha414@gmail.com>

**References**:
- **st: wealth score using principal component analysis (PCA)** *From:* Shikha Sinha <shikha.sinha414@gmail.com>
- **Re: st: wealth score using principal component analysis (PCA)** *From:* Nick Cox <njcoxstata@gmail.com>
