-----Original Message-----
From: Peter Muhlberger <pmuhl1848@gmail.com>
Date: Thu, 24 Sep 2009 05:01:25
To: statalist@hsphsun2.harvard.edu <statalist@hsphsun2.harvard.edu>
Subject: st: svyset problem 2: using svy with partially complete surveys

I'm struggling with the question of how to efficiently set up a complex survey analysis. After collecting the data (with simple random sampling, kind of), it is clear that two variables (simplifying here) matter for the kinds of outcomes I'm examining: the % low English proficiency (lep) in a school and the gender of the respondent. I have auxiliary data that tell me, for all schools in the population, what the school size is and what its lep and gender numbers are.

To reweight my sample to (hopefully) make it somewhat more like the population, I could create a pweight that indicates, for each person in my data, how many people in the population they represent who are of the same gender and in a school of the same (median split) category of lep. I can then use the svy commands for estimation.

The problem, however, is that I have a fair number of partially complete surveys. Thus, depending on which variables go into a particular analysis, my N varies. Consequently, the pweights would have to be recalculated for almost every analysis. Very time consuming.

An alternative I've considered is to define strata that identify unique combinations of lep and gender and then feed this information to the poststratification options in svyset. The problem here is that each PSU (school) now overlaps two strata, one for each gender in that school, and it's not clear what the FPC numbers should be for each stratum. I'm guessing this arrangement will probably violate assumptions behind svy.

Does anyone know of a better way to address this problem?

Peter
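For concreteness, the per-analysis reweighting described in this message can be sketched as below. This is only an illustration in Python, not Stata code and not a recommended solution: the variable names (`female`, `high_lep`, `y`, `x`), the toy data, the population counts, and the helper `cell_pweights` are all hypothetical. Each respondent's weight is the population count of their gender-by-lep cell divided by the number of respondents in that cell who are complete on the variables used in the analysis, so the weights change whenever the variable list changes.

```python
from collections import Counter


def cell_pweights(rows, pop_counts, needed_vars):
    """Return {row index: pweight} for rows complete on needed_vars.

    rows        : list of dicts, one per respondent (None marks a missing answer)
    pop_counts  : {(female, high_lep): number of people in the population cell}
    needed_vars : variables that must be non-missing for this analysis
    """
    # Keep only respondents who answered every variable in this analysis.
    complete = [
        i for i, r in enumerate(rows)
        if all(r.get(v) is not None for v in needed_vars)
    ]
    # Count complete respondents in each gender-by-lep cell.
    n_cell = Counter((rows[i]["female"], rows[i]["high_lep"]) for i in complete)
    # Weight = population cell size / complete-case sample size in that cell.
    weights = {}
    for i in complete:
        cell = (rows[i]["female"], rows[i]["high_lep"])
        weights[i] = pop_counts[cell] / n_cell[cell]
    return weights


# Hypothetical toy data: four respondents, two with a missing answer.
rows = [
    {"female": 1, "high_lep": 0, "y": 3.0, "x": 1.0},
    {"female": 1, "high_lep": 0, "y": None, "x": 2.0},
    {"female": 0, "high_lep": 1, "y": 2.0, "x": None},
    {"female": 0, "high_lep": 1, "y": 5.0, "x": 4.0},
]
# Hypothetical population counts per (female, high_lep) cell.
pop_counts = {(1, 0): 100, (0, 1): 60}

w_y = cell_pweights(rows, pop_counts, ["y"])        # analysis using y only
w_yx = cell_pweights(rows, pop_counts, ["y", "x"])  # analysis using y and x
```

In this toy case, respondent 3 represents 30 people when only `y` is analyzed but 60 when both `y` and `x` are required, which is the recalculation burden the message describes.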

