
# SV: st: exploratory factor analysis with dichotomous and continuous data

From: Frauke Rudolf
To: "statalist@hsphsun2.harvard.edu"
Subject: SV: st: exploratory factor analysis with dichotomous and continuous data
Date: Thu, 22 Nov 2012 06:31:19 +0000

```Thank you for your answer, Jay.
I guess, judging from your description, that it is a structural zero. Hemoptysis means coughing up blood, which isn't really possible without coughing.
However, those are not the only variables, so is there a way to work around this problem in the analysis?
I don't think it makes sense to add pseudo-cases.
Frauke

-----Original message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of JVerkuilen (Gmail)
Sent: 21 November 2012 13:29
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: exploratory factor analysis with dichotomous and continuous data

On Wed, Nov 21, 2012 at 5:37 AM, Frauke Rudolf <FRAUKE.RUDOLF@ki.au.dk> wrote:
>
> I found some useful threads on the net, so now I know why I get the message: one of the 2x2 tables for the dichotomous variables has a cell with 0 observations:
> haemoptysi |         cough
>          s |         1          2 |     Total
> -----------+----------------------+----------
>          1 |       168          0 |       168
>          2 |       896         53 |       949
> -----------+----------------------+----------
>      Total |     1,064         53 |     1,117
>
> What I could not find was a solution for how to deal with this in order to be able to run an EFA.
> I hope you can help me with this.

First of all, is this a sampling zero or a structural zero, i.e.,
something that is impossible (silly example: male patients of an
OB/GYN)? I simply don't know the substance well enough to judge. If
it's a structural zero, you need to decide whether the EFA model is
even appropriate. I ask because this is a pretty big sample and thus
a sampling zero seems unlikely, but I really don't know.

If it is a sampling zero instead, you can add a small number of
pseudo-cases to all cells in your contingency table. In the loglinear
model literature this is called "flattening" and is often necessary
to get reasonable estimates.

Essentially you have to do this in small doses, adding one, then two,
then three cases, to make sure that the resulting correlations don't
shift dramatically.

Jay
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
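The "small doses" check Jay describes can be sketched outside Stata. The following is a minimal illustration in Python, not the method used in the thread: it adds k pseudo-cases to every cell of the posted 2x2 table and recomputes an association measure for each k. The cos-pi formula used here is only a quick approximation to the tetrachoric correlation, chosen as an assumption to keep the sketch dependency-free; it is not the maximum-likelihood estimate that dedicated routines produce. The function names `flatten`, `cos_pi_tetrachoric`, and `sensitivity`, and the grid of k values, are hypothetical.

```python
import math

# Counts copied from the 2x2 table in the thread
# (rows: haemoptysis = 1, 2; columns: cough = 1, 2).
TABLE = [[168, 0], [896, 53]]


def flatten(table, k):
    """Return a copy of the table with k pseudo-cases added to every cell."""
    return [[cell + k for cell in row] for row in table]


def cos_pi_tetrachoric(table):
    """Cos-pi approximation to the tetrachoric correlation of a 2x2 table.

    Assumption: this closed-form approximation stands in for a proper
    maximum-likelihood tetrachoric estimate, purely for illustration.
    """
    (a, b), (c, d) = table
    if b * c == 0:
        # Zero in the off-diagonal: the odds ratio is infinite (or undefined).
        return float("nan") if a * d == 0 else 1.0
    if a * d == 0:
        # Zero on the main diagonal: the odds ratio is zero.
        return -1.0
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


def sensitivity(table, ks=(0, 0.5, 1, 2, 3)):
    """Approximate correlation for each number of pseudo-cases k."""
    return [(k, cos_pi_tetrachoric(flatten(table, k))) for k in ks]


if __name__ == "__main__":
    for k, r in sensitivity(TABLE):
        print(f"k = {k}: approximate correlation = {r:.3f}")
```

If the printed values move a lot between successive k, the estimate is being driven by the pseudo-cases rather than by the data, which is the situation the reply warns about. None of this addresses the structural-zero case, where the question is whether the model is appropriate at all.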