Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# SV: st: exploratory factor analysis with dichotomous and continuous data

From: Frauke Rudolf <[email protected]>
To: "[email protected]" <[email protected]>
Subject: SV: st: exploratory factor analysis with dichotomous and continuous data
Date: Thu, 22 Nov 2012 06:31:19 +0000

```
Thank you for your answer, Jay.
Judging from your description, I think it is a structural zero. Hemoptysis means coughing up blood, which isn't really possible without coughing.
However, those are not the only variables, so is there a way to work around this problem in the analysis?
I don't think it makes sense to add pseudo-cases.
Frauke

-----Original message-----
From: [email protected] [mailto:[email protected]] On behalf of JVerkuilen (Gmail)
Sent: 21 November 2012 13:29
To: [email protected]
Subject: Re: st: exploratory factor analysis with dichotomous and continuous data

On Wed, Nov 21, 2012 at 5:37 AM, Frauke Rudolf <[email protected]> wrote:
>
> I found some useful threads on the net, so now I know why I get the message. It is due to one of the dichotomous variables having 0 observations in one of the 2x2 tables:
> haemoptysi |         cough
>          s |         1          2 |     Total
> -----------+----------------------+----------
>          1 |       168          0 |       168
>          2 |       896         53 |       949
> -----------+----------------------+----------
>      Total |     1,064         53 |     1,117
>
> What I could not find was a solution for how to deal with this in order to be able to run an EFA.
> I hope you can help me with this.

First of all, is this a sampling zero or a structural zero, i.e.,
something that is impossible (silly example: male patients of an
OB/GYN)? I simply don't know the substance to be able to judge. If
it's a structural zero you need to decide if the EFA model is even
appropriate. I ask because this is a pretty big sample and thus a
sampling zero seems unlikely, but I really don't know.

If not, you can add a certain number of pseudo-cases to all cells in
your contingency table. In the loglinear model literature this is
called "flattening" and is often necessary to get reasonable
estimates.

Essentially you have to do this in small doses, adding one, then two,
then three cases, to make sure that the resulting correlations don't
shift dramatically.

Jay
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
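The "flattening" check Jay describes (add a few pseudo-cases to every cell, then see how far the association moves) can be sketched roughly as below. This is only an illustration in Python, not Stata code and not anything posted in the thread. The helper names `flatten` and `yules_q` are made up for the example, and Yule's Q is used as a simple stand-in for the tetrachoric correlation that an EFA on dichotomous items would normally rely on. The cell counts are the ones from the haemoptysis by cough table above.

```python
def flatten(table, k):
    """Add k pseudo-cases to every cell of a 2x2 table given as (a, b, c, d)."""
    return tuple(cell + k for cell in table)


def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc) for the 2x2 table [[a, b], [c, d]].

    Used here only as an easy-to-compute association measure; it is not
    the tetrachoric correlation.
    """
    return (a * d - b * c) / (a * d + b * c)


# Rows: haemoptysis = 1, 2; columns: cough = 1, 2 (counts from the thread).
observed = (168, 0, 896, 53)

# Add zero, one, two, then three pseudo-cases per cell and watch the measure.
for k in (0, 1, 2, 3):
    q = yules_q(*flatten(observed, k))
    print(f"pseudo-cases per cell: {k}  Yule's Q: {q:.3f}")
```

With the empty cell left as it is, Q sits at its boundary value of 1. Adding a single pseudo-case per cell already pulls it down to roughly 0.82, and it keeps falling with two and three. That sensitivity is the kind of dramatic shift the advice warns about, and it is consistent with treating the zero as structural rather than patching it.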
