Glenn Hoetker <ghoetker@uiuc.edu>

statalist@hsphsun2.harvard.edu

st: Factor analysis(?) question - missing data

Tue, 22 Apr 2008 13:06:31 -0500

Hi all.

This is perhaps more of a statistical question than a Stata question. My situation is this: I have a large dataset with 5-6 indicators for each of several latent variables. As an example, take 5 measures of innovative output, x1-x5. The problem is that very few observations have all 5 measures; some are missing x1, some x2, etc. Almost every observation has at least 3 measures, and most have 4.

Is there any way to optimally combine these indicators to measure the underlying construct of innovative output, using all available measures for a given observation, i.e., x1-x4 for one observation, [x1-x3, x5] for another, etc.? If I thought the indicators were equally weighted, I could just average over the available variables for each observation, setting aside issues of measurement error. However, I'm not convinced they are equally weighted and would like to do this in a more rigorous fashion.
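
To make the equal-weighted fallback concrete, here is a minimal sketch of averaging over whichever indicators are present. It is written in Python purely for illustration; the function name `available_mean` and the data values are hypothetical, missing values are represented as `None`, and the indicators are assumed to be on comparable scales already.

```python
from statistics import mean

def available_mean(row):
    """Average the non-missing indicators of one observation (None = missing)."""
    present = [v for v in row if v is not None]
    # If every indicator is missing, there is nothing to average.
    return mean(present) if present else None

# Hypothetical observations on x1-x5 (made-up values, not real data)
obs = [
    [2.0, 4.0, 6.0, 8.0, None],   # x5 missing: averages x1-x4
    [1.0, 2.0, 3.0, None, 6.0],   # x4 missing: averages x1-x3 and x5
]

scores = [available_mean(row) for row in obs]
print(scores)
```

In Stata, `egen score = rowmean(x1-x5)` should give the same kind of result, since `rowmean()` ignores missing values. This is exactly the equal-weighting shortcut I would like to improve on, not a solution to the weighting problem.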

Any suggestions would be most welcome. Thanks in advance.

GPH

Glenn Hoetker

Associate Professor (Business, Law, Institute for Genomic Biology)

Resident Associate, Center for Advanced Study, Science and Technology in the Pacific Century (STIP) initiative

Faculty Fellow, Academy for Entrepreneurial Leadership

University of Illinois

217-265-4081

ghoetker@uiuc.edu

Personal website: www.business.uiuc.edu/ghoetker

Science & Technology in the Pacific Century initiative website: www.business.uiuc.edu/stip

