
# Re: st: Exploratory factor analysis using a mix of categorical and continuous variables

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Exploratory factor analysis using a mix of categorical and continuous variables
Date: Fri, 3 Feb 2012 09:52:33 +0000

```
I don't see that this can be easily classified as correct or incorrect.

Some people recommend strongly against such a mix, and some people
would argue that the pragmatic defence is whether it provides
interesting or useful results.

As you yourself have labelled the exercise "exploratory", the acid test
is surely what you learn from the data by examining the results.

This is a cross-disciplinary list and we can only report our own
perspectives. In what is nominally my own discipline, geography,
factor analysis in the sense of an exploratory exercise throwing all
the data into one pot, stirring and seeing what you got, was probably
the most popular technique of all in the late 1960s and early 1970s. I
met several people whose one statistical idea was to read everything
into SPSS and do a factor analysis. I even met some people who did not
know that there were simpler statistical techniques. In geography this
fashion faded rapidly as too many people did not understand what they
were doing or found no useful new results. However, I am now touching
on quite different stories.

In terms of your question, my only guess is that from your variable
names you have a ragbag here and you won't find much interesting or
useful structure. It is better to decide which response or outcome
variables you most want to explain or predict and think about
how those might be modelled. The basic problem is not soluble by using
a slightly different multivariate command. Having a mix of predictor
types, dummies, categorical and continuous variables, is of course a
soluble problem.

Nick

On Fri, Feb 3, 2012 at 9:32 AM, Urmi Bhattacharya <ub3@indiana.edu> wrote:

> I am using exploratory factor analysis to generate factor loadings and
> the corresponding uniqueness values using 16 variables. I have a mix
> of dummy variables (taking values 1 or 0), categorical variables
> (positive integers), and continuous variables. I am using the
> following command:
>
> factor govt_school chais_desk_s schl_toilet_s schl_water_s
> tuit_fee_ gen_s pupil_teach_s inservice_training_s library_s
> computer_use_s playgrnd_s  formal_teach_eval_s distance_primaryschool,
> ipf factor(1).
>
> My question is whether this is the correct procedure to use when I
> have variables that are not continuous. If not, is there a command in
> Stata that better handles this?

```