

From: Kim Peeters <kimpeeters84@yahoo.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: xtlogit, pa questions
Date: Thu, 8 Mar 2012 01:29:19 -0800 (PST)

Dear Brendan and David,

Within each cluster the value of the response variable is either 0 or 1 for all occasions. This is not due to any design but just because none of the participants (one participant = one cluster) in the study experienced a transition from one state to the other state (no data or study error).

Both the response variable and the RHS variables are measured yearly. The response variable does not vary, but recall that this was not done by design. The RHS variables do vary for each occasion (i.e., for every year).

The study endeavors to understand the relationship between the response variable (either 0 or 1) and the independent variables.

Thank you for any help you can provide.

Best regards,
Kim

----- Original Message -----
From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Cc:
Sent: Tuesday, March 6, 2012 3:52 PM
Subject: Re: st: xtlogit, pa questions

Please clarify the following point:

> 3. Within each cluster, the response variable is
> always the same (either 0 or 1). As such, is a population averaged logit modeling
> approach still
> statistically valid?

Are you saying that, within a cluster, either the value of the response variable is 0 for all occasions or the value of the response variable is 1 for all occasions (i.e., the values within a cluster are not a mixture of 0s and 1s)?

David Hoaglin

----- Original Message -----
From: Brendan Halpin <brendan.halpin@ul.ie>
To: statalist@hsphsun2.harvard.edu
Cc:
Sent: Tuesday, March 6, 2012 1:43 PM
Subject: Re: st: xtlogit, pa questions

On Tue, Mar 06 2012, Kim Peeters wrote:

> You mention “If they do, it seems to me that cluster membership has a
> stronger effect than the covariates, probably for structural reasons.”
> Does this imply that there exist better models to establish the
> relationship between the response variable and the independent
> variables?

If I understand it correctly, you have multiple X measurements per person, but only a single Y? Does the outcome come after all the RHS measurements?

If there is, by design, a single outcome per case, then this is not really clustered data, but rather individual-level data with multiply observed explanatory variables:

    Y_i = f(X_{i1},X_{i2},...X_{iT})

rather than

    Y_{it} = f(X_{it})

Rather than a clustered model, I'd be looking for a way of simplifying the RHS variables, e.g., via factor analysis. If they are categorical you might be able to cluster them using sequence analysis (depending on how time functions, and on your substantive question).

Regards,
Brendan

----- Original Message -----
From: Kim Peeters <kimpeeters84@yahoo.com>
To: Statalist <statalist@hsphsun2.harvard.edu>
Cc:
Sent: Tuesday, March 6, 2012 11:34 AM
Subject: st: xtlogit, pa questions

Dear,

I am analyzing a population averaged logit model for panel data (xtlogit, pa), which is equivalent to a Generalized Estimating Equation model (xtgee) when the link function is logit, the distribution of the response variable is binomial and the correlation structure within the response variable is exchangeable.

I have read numerous papers and textbooks (including Generalized Estimating Equations by Hardin and Hilbe (2003)) but I have some unresolved questions:

1. I report the odds ratio for each independent variable. I suppose that the interpretation of the odds ratio is similar to the interpretation of odds ratios in the standard logistic regression. Is this correct?

2. When I adjust my standard errors for clustering, the obtained semi-robust standard errors are smaller than the standard errors I obtain when I do not adjust my standard errors for clustering. This appears to be counter-intuitive. Is this phenomenon valid and how should I interpret this?

3. Within each cluster, the response variable is always the same (either 0 or 1). As such, is a population averaged logit modeling approach still statistically valid?

Thank you for any help you can provide.

Best regards,
Kim

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
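
For readers following the thread, here is a minimal Python sketch (standard library only) of the two data steps discussed above: checking whether the response is constant within each cluster, as David asks, and reshaping the panel to one row per person, in the spirit of Brendan's Y_i = f(X_{i1},...,X_{iT}). The toy data, the function names, and the use of a per-person mean of X are all assumptions made for illustration. Brendan suggests factor analysis or sequence analysis, and the mean is only a simple stand-in for that summarising step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format panel: (person_id, year, y, x).
# All values are invented for illustration; they are not from the thread.
panel = [
    (1, 2009, 0, 1.2), (1, 2010, 0, 1.5), (1, 2011, 0, 1.1),
    (2, 2009, 1, 2.9), (2, 2010, 1, 3.4), (2, 2011, 1, 3.0),
    (3, 2009, 0, 0.7), (3, 2010, 0, 0.9), (3, 2011, 0, 1.0),
]


def response_constant_within_cluster(rows):
    """Return True if every cluster (person) shows a single response value."""
    seen = defaultdict(set)
    for pid, _year, y, _x in rows:
        seen[pid].add(y)
    return all(len(values) == 1 for values in seen.values())


def collapse_to_person_level(rows):
    """Return one row per person: (person_id, y, mean of x over occasions).

    Only meaningful when y is constant within person. The mean is an
    assumed summary standing in for a richer reduction of the X history.
    """
    ys = {}
    xs = defaultdict(list)
    for pid, _year, y, x in rows:
        ys[pid] = y
        xs[pid].append(x)
    return [(pid, ys[pid], mean(xs[pid])) for pid in sorted(ys)]


if response_constant_within_cluster(panel):
    # Individual-level data: one outcome per person, summarised covariates.
    person_level = collapse_to_person_level(panel)
else:
    # The response varies within at least one person, so the data are
    # genuinely clustered and the collapse above would discard information.
    person_level = None
```

The collapsed rows could then go into an ordinary (non-panel) logistic regression, with one observation per participant.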

**Follow-Ups**:
- **Re: st: xtlogit, pa questions** *From:* brendan.halpin@ul.ie (Brendan Halpin)

**References**:
- **st: xtlogit, pa questions** *From:* Kim Peeters <kimpeeters84@yahoo.com>
- **Re: st: xtlogit, pa questions** *From:* David Hoaglin <dchoaglin@gmail.com>
