
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Interpreting Polychoric PCA results in STATA 11


From   Stas Kolenikov <[email protected]>
To   [email protected]
Subject   Re: st: Interpreting Polychoric PCA results in STATA 11
Date   Mon, 8 Aug 2011 13:28:56 -0400

-polychoric- (and -polychoricpca-, which is a wrapper for -polychoric
... , pca-) does all the work that is needed. (I am not sure about the
scaling-by-the-eigenvalues issue, though; you'd only need that if
you have several scores that you want to have on comparable scales.
Eigenvalue > 1 is a useful, but not necessarily the best, criterion.)
You'd want to read our paper with Gustavo Angeles
(http://www.citeulike.org/user/ctacmo/article/4090868), where we
discuss whether you need to create the separate dummy variables (short
answer: you don't). It looks like you'd benefit from general reading
on PCA, too, e.g., from
http://www.citeulike.org/user/ctacmo/article/553295: adding the
scores, weighted or not, is an absolutely meaningless operation.
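
As a rough sketch of what this could look like for the variables in the question quoted below: the variable names and the score(index) prefix are taken from that question, and the user-written -polychoric- package is assumed to be installed. Keeping only the first component as the index is a common convention for asset indices and is an assumption here, as are the follow-up -xtile- step and the name wealth_q. The /// line continuation works in a do-file, not at the interactive prompt.

```stata
* Sketch only: assumes the user-written -polychoric- package is installed
* and that the variable names match those in the quoted question.

* Ordinal and count variables enter as they are; no dummy variables.
polychoricpca housetype houseowner electricity mobilephone cycle, ///
    score(index) nscore(1)

* index1 holds the first principal component score.
* Assumption: it is used on its own as the wealth index,
* rather than being added to further scores.
summarize index1

* Hypothetical follow-up: wealth quintiles (wealth_q is an illustrative name).
xtile wealth_q = index1, nquantiles(5)
tabulate wealth_q
```

With nscore(1) only index1 is created, so there is nothing to weight or add up.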

On Mon, Aug 8, 2011 at 4:56 AM, kiran javaid <[email protected]> wrote:
> Hello everybody,
>
> I want to form a wealth index at household level by using variables
> like housetype(mud=1, bricks_mud=2, bricks_cement=3), household owner
> (rented=1 owned=2), electricity (no=0, yes=1), mobilephone (range is
> from 0 - the number of mobile phones a household has, for my sample,
> the maximum is 13), cycles (no cycle=0, yes cycle = 1,2,3... the
> number of cycles owned) etc. As the variables are categorical I should
> use polychoric pca instead of simple pca, right? One question I have
> is that if i use polychoric pca then do i need to generate a separate
> variable for each category of these variables? for instance, in the
> household type category, should i have one variable as housetype_mud
> (mud house=1, not mud house=0), then another as housetype_bricks_mud
> (bricks_mud house=1, not bricks_mud house=0) and similarly for
> bricks_cement? but then i would have to leave one category out to
> avoid multicollinearity, right? Furthermore, if this is the case then
> what happens to the mobile phone and cycle variables? Do i still have
> just one variable for mobile phone (going from 0 to 13) and one for
> cycle as it currently is?
>
> Secondly, the command i'm using for polychoric pca is: polychoricpca
> housetype houseowner electricity mobilephone cycle, score(index)
> nscore(3)
> (of course, if i need to generate separate variables for all categories
> then those will replace housetype, houseowner...)
>
> My question is that this one command will give me the factors
> generated? Since i typed nscore (3), it will give me 3 factor
> components. However, the eigenvalues are greater than 1 for only 2
> factor components. So in order to get one composite index (wealth
> index) i should multiply index1 with the eigenvalue for index1, and
> index2 with eigenvalue for index2 and then add these two up? like we
> do in simple pca?
>
> Any and all help will be greatly appreciated.
>
> - Kiran
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.


