
Re: st: GEE or svy:logit

From   Steven Samuels <>
Subject   Re: st: GEE or svy:logit
Date   Sun, 3 Feb 2008 15:35:26 -0500

From your description, the household survey is not a formal sample of a target population. (The original case and control groups might have been.) So, off-hand, you do not need any -svy- commands at all. Ordinary -logit- with a cluster option, or -glm- (which does GEE), also with a cluster option, will be sufficient and will give equivalent inference.
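A minimal sketch of the commands this would involve, run on simulated data so that it is self-contained. The variable names (hhid, contaminated, ill) and all simulation settings are assumptions for illustration, not taken from the actual study. The -xtgee- line is included only for comparison with an exchangeable working correlation.

```stata
* Sketch only: simulated data. hhid, contaminated, and ill are hypothetical names.
clear
set seed 2008
set obs 600
* 200 households with 3 adults each
generate hhid = ceil(_n/3)
generate u = runiform()
generate e = rnormal(0, 0.5)
* Household-level exposure and a shared household effect
bysort hhid: generate contaminated = (u[1] < 0.5)
bysort hhid: generate hh_effect = e[1]
generate ill = runiform() < invlogit(-1 + 0.7*contaminated + hh_effect)

* Ordinary logit with standard errors clustered on household
logit ill contaminated, vce(cluster hhid)

* The same model fitted through -glm-
glm ill contaminated, family(binomial) link(logit) vce(cluster hhid)

* GEE with an exchangeable working correlation, for comparison
xtset hhid
xtgee ill contaminated, family(binomial) link(logit) corr(exchangeable) vce(robust)
```

The -logit- and -glm- fits should give the same coefficients. The -xtgee- estimates can differ somewhat because the working correlation is not independence.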

A more important question is: will the presence of a large percentage of case households distort your analysis? For example, if one of your goals is to predict an outcome and that outcome is related to the original case variable, then the outcome will be over-represented in the sample. To get around this, you might want to down-weight the case households. If case households constitute, say, 1% of the population (uncommon disease), then consider giving them a weight of '1' and control households a weight of '99'. You can do this in either -glm- or -logit-. Again, no -svy- version is needed.
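As a rough illustration of the 1-to-99 weighting, again on simulated data. The names hhid, case_hh, age, ill, and wt are hypothetical, and treating the weights as probability weights ([pweight]) is an assumption about how one might implement the suggestion.

```stata
* Sketch only: simulated data. hhid, case_hh, age, ill, and wt are hypothetical names.
clear
set seed 2008
set obs 600
generate hhid = ceil(_n/3)
generate u = runiform()
* Household-level indicator for membership in the original case group
bysort hhid: generate case_hh = (u[1] < 0.5)
generate age = 20 + floor(50*runiform())
generate ill = runiform() < invlogit(-2 + 0.02*age + 0.8*case_hh)

* Weight 1 for case households, 99 for control households (the 1% example)
generate wt = cond(case_hh, 1, 99)

* Weighted fits with household-clustered standard errors; no -svy- prefix
logit ill age [pweight = wt], vce(cluster hhid)
glm ill age [pweight = wt], family(binomial) link(logit) vce(cluster hhid)
```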

However, this is extreme, and it will not work if your controls were pair-matched to cases. If your outcome is unrelated to the outcome of the case-control study, then go ahead and use the unweighted data. A convenience sample is a convenience sample.


For this study, the individuals are a convenience sample from
households that participated in a case-control "parent" study in which
the case households had contaminated water and control households did not.

This cross-sectional study includes all of the adults within the case
and control households who consented to participate by completing the
individual questionnaire.


Brenda, please give some detail about the survey design: 1. What was
the target population? 2. How did you select the sample? Please give
all steps.

On Feb 2, 2008, at 12:49 PM, wrote:

Greetings from a new Statalister.
I am in need of advice, including references, if you have any.

We have done a household-level cross-sectional survey including all consenting adults within the household (1 to 4 per cluster). There are both household-level and personal-level variables.
The dependent variable is nominal at the personal level (ill/not ill).
The focal independent variable is nominal and at the household level (water contaminated/not).
Other variables of interest (explanatory, in relation to focal independent variable) are at the personal and household level.

My question is: do I need to use GEE to adequately account for clustering within households, or would -svy: logit- in Stata do this? (The ICC for illness & household is 0.08, SE 0.08)

Brenda Coleman, PhD candidate

----- End forwarded message -----

*   For searches and help try:
