From: Yashin <yashin5@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Polychoric PCA error message
Date: Tue, 29 Jan 2013 22:34:29 -0500

Dear Statalisters:

I am trying to run polychoric PCA from Stas Kolenikov on a data subset (wealth index) that--pre-winnowing--has 32 dichotomous variables, four ordinal variables, and one continuous variable. I am getting the following error messages repeatedly:

    could not calculate numerical derivatives
    missing values encountered
    numerical derivatives are approximate
    nearby values are missing

I found the following thread addressing this issue:

http://www.stata.com/statalist/archive/2012-11/msg00826.html

Similarly, I found that for those coefficients in the correlation matrix that are either zero or > 0.9, the 2x2 tables invariably have a cell with a small count (usually 0, in other cases 1, 2, or 3, and in one case 7). In this case, this would not be a structural zero but a sampling zero.

I have three related questions I am hoping someone might help shed light on:

1) When I examined the six 2x2 tables for variable pairs with correlation coefficients > 0.9, the variables did not appear to be highly correlated, and each table included one cell with 0. I'm copying a couple of examples below:

. /* tabulate high correlation pairs */
. tab vacuum carpet

           |        carpet
    vacuum |         0          1 |     Total
-----------+----------------------+----------
         0 |        21        835 |       856
         1 |         0        342 |       342
-----------+----------------------+----------
     Total |        21      1,177 |     1,198

. tab computer stove

           |         stove
  computer |         0          1 |     Total
-----------+----------------------+----------
         0 |        12      1,033 |     1,045
         1 |         0        146 |       146
-----------+----------------------+----------
     Total |        12      1,179 |     1,191

2) When I run polychoric with only the dichotomous variables, and then with the same variables plus the additional 5 variables described above (ordinal and continuous), I get different correlation coefficients in the correlation matrix for the same variable pairs. How could this be? Sometimes the values are similar but not identical, and in other cases they are quite different (some of the correlations that are > 0.9 when binary, ordinal and continuous variables are included in the matrix become zero when only binary variables are included).

3) To address the issue of 2x2 tables with zeros, one colleague suggested flattening in the previous thread ( http://www.stata.com/statalist/archive/2012-11/msg00829.html )--I wondered if there are other options.

Many thanks for any thoughts!

Yashin

--
ysl
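
A minimal sketch of the zero-cell issue raised in question 1, offered as an illustration rather than as a fix for the polychoric run itself. It assumes Stata's built-in tetrachoric command as a stand-in for the user-written polychoric package, which is not called here. The code rebuilds the vacuum-by-carpet table from the counts quoted in the post and estimates the tetrachoric correlation twice: once as is, and once with the zeroadjust option, which raises a zero cell to one-half while keeping the row and column totals fixed (one form of flattening). The count variable n is a name introduced only for this example.

```stata
* Rebuild the vacuum-by-carpet 2x2 table from the counts in the post.
* n is a hypothetical helper variable holding each cell's count.
* The empty cell (vacuum = 1, carpet = 0) is simply left out.
clear
input vacuum carpet n
0 0 21
0 1 835
1 1 342
end

* One observation per household: replace each row by n copies of itself.
expand n
drop n

* Confirm that the table matches the one in the post.
tabulate vacuum carpet

* Tetrachoric correlation with the sampling zero left untouched.
tetrachoric vacuum carpet

* Same pair, with the zero cell adjusted to one-half
* (row and column totals are preserved).
tetrachoric vacuum carpet, zeroadjust
```

With an empty cell, every household that owns a vacuum also has carpet, so the data cannot rule out a perfect association and the unadjusted estimate is pushed toward the boundary. Comparing the two runs gives a rough sense of how much the reported correlation depends on that single cell.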