# Re: st: Need Help with Clusters and Weighting

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Need Help with Clusters and Weighting
Date: Fri, 15 Feb 2008 10:13:49 -0500

Liliana, the information that you have provided is not detailed enough for me to give you much help. I cannot tell from your description, for example, if your data ARE a valid probability sample of Harris County. What was the sampling frame? Was it an area sample? What were the 'strata'; what were 'clusters'? How were they selected? What were 'samples'? How were THEY selected? Was each 'sample' one person? What were the probabilities of selection?

The way that you phrase your questions suggests that you are on the right track, but don't have much knowledge yet of sampling. I suggest that you consult with the survey scientists who designed the sample and also with the biostatistics faculty members who teach sampling at UTH. They will suggest a good book for background reading. Ask your advisor if one of them can be on your doctoral committee. Statalist is really not a good place to learn sampling, but we will be happy to help you with Stata commands once you understand the sample details. Make sure that you consult the Stata survey -help- and Survey Manual too.

Good luck!

-Steven

On Feb 14, 2008, at 6:14 PM, Rodriguez, Liliana F wrote:

Hello Stata Users List,

I'm working on my dissertation and need to analyze a small dataset (n=210) by logistic regression (with odds ratios). I have nine variables (all categorical or binomial) and my outcomes are binary Y/N. Data were obtained by conducting a survey from 7 samples in each of 30 clusters. The clusters were soft-stratified into 3 groups according to income levels, as low, medium and high. I want to be sure that my sample is representative of the population in Harris County, Texas.

Should I weight for the clustering to analyze within a stratum? If yes, what are the commands?

What are the commands to do weighting for the strata, and the clustering, when analyzing over the entire sample?

Do I use income levels, crude income, or the population in each cluster as the weight?

Thanks.... I still have a lot to learn.

Liliana Rodriguez
Houston, Texas

