Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: <[email protected]>
To: [email protected]
Subject: Re: st: "Separation" issue in clustered/Longitudinal binary data.
Date: Wed, 22 Dec 2010 15:01:51 -0600 (CST)

You are absolutely right! In my case, the research design didn't collect the "side effects" data if the respondents were not on meds (there was a skip pattern). Usually in clinical research there is a placebo group serving as the reference, so that the denominator of the odds ratio won't be a true zero. I had hoped that the baseline data, where most respondents were not on meds and didn't report any side effects (SE), could serve as their own control group. But I forgot the most basic assumption of the odds ratio. Thank you so much for the reminder.

However, just out of curiosity, although my question has been well answered: is there any way to model completely or quasi-completely separated binary outcomes in longitudinal data? I see that the Firth correction is now available for logistic and Cox proportional hazards regression, but I can't find an equivalent for longitudinal data. I can easily foresee this being an issue in longitudinal data analysis whenever the outcome variable is binary.

Best, and happy Christmas.

Cheenghee M Koh

---- Original message ----
>Date: Wed, 22 Dec 2010 10:23:21 +0000 (GMT)
>From: [email protected] (on behalf of Maarten buis <[email protected]>)
>Subject: Re: st: "Separation" issue in clustered/Longitudinal binary data.
>To: [email protected]
>
>--- On Wed, 22/12/10, [email protected] asked:
>> > The outcome variable is a binary variable (a patient
>> > reported drug's side effect) with repeated measures for
>> > three waves. Now I have an intervention (whether the
>> > participant received the drug). <snip>
>
>--- On Wed, 22/12/10, Maarten buis answered:
>> I may be missing something obvious, but don't you need to
>> use the drug in order to experience its side-effects? <snip>
>> If something like that is happening in your data, then it is
>> hard to see how an "effect" of your treatment could have a
>> meaningful substantive interpretation.
>
>To expand a bit on this answer: The problems with separation
>are a logical consequence of how we define effects in
>"logit-like" models. The effect is a ratio of odds. Consider
>the example below:
>
>*--------------- begin example ------------------
>// get some data and prepare it
>sysuse auto, clear
>gen byte good = rep78 > 3 if rep78 < .
>gen byte baseline = 1
>
>// estimate a logistic regression
>logit good i.foreign baseline, or nocons
>*---------------- end example ---------------------
>(For more on examples I sent to the Statalist see:
>http://www.maartenbuis.nl/example_faq )
>
>The number reported for baseline is the baseline odds,
>the number of successes per failure for someone (in this
>case somecar) who has the value 0 on all covariates. So
>for a domestic (=US) car we expect to find .297 cars
>with a good repair record for every car with a bad repair
>record. The effect of foreign tells us that the odds of
>having a good repair record are 20.18 times larger for
>foreign cars than for domestic cars.
>
>It is also instructive to look at the individual odds.
>In the example below we did not leave the variable for
>the reference category out of the model, but instead
>excluded the constant.
>
>*---------------- begin example -------------------
>// get the odds for foreign and domestic cars
>logit good ibn.foreign, nocons or
>
>// odds ratio is a well chosen name for this statistic,
>// as it is literally a ratio of odds
>di exp(_b[1.foreign])/exp(_b[0.foreign])
>*----------------- end example --------------------
>
>Here we see that, as before, the odds of having a good
>repair record for domestic cars are .297 good cars for
>every bad car. We can now also see that the odds of having
>a good repair record for foreign cars are 6 good cars for
>every bad car. The odds ratio we found in the first example
>is thus literally the ratio of these odds.
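The point that the odds ratio is literally a ratio of odds can also be checked without -logit-, by counting cars directly. This is only a sketch under the same setup as the quoted examples (auto data, good = rep78 > 3); the scalar names (good_dom, bad_dom, good_for, bad_for, odds_dom, odds_for) are made up for illustration and are not part of the quoted code.

```stata
sysuse auto, clear
gen byte good = rep78 > 3 if rep78 < .

// count good and bad cars within each origin group
count if good == 1 & foreign == 0
scalar good_dom = r(N)
count if good == 0 & foreign == 0
scalar bad_dom = r(N)
count if good == 1 & foreign == 1
scalar good_for = r(N)
count if good == 0 & foreign == 1
scalar bad_for = r(N)

// odds = number of successes per failure within each group
scalar odds_dom = good_dom / bad_dom
scalar odds_for = good_for / bad_for

// the odds ratio is the ratio of the two odds
di "odds domestic = " odds_dom
di "odds foreign  = " odds_for
di "odds ratio    = " odds_for / odds_dom
```

The three displayed numbers should agree with the .297, 6 and 20.18 discussed in the quoted message.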
>
>In your case the baseline odds are 0: among patients who
>have not been given the drug there are 0 patients who
>experience the side-effects for every patient who did
>not experience the side-effects. How many times larger
>are the odds of experiencing the side-effects if the
>baseline is 0? There is just no answer to that question.
>You can also see that by noticing that the odds ratio is
>in that case some number divided by 0, which is undefined.
>
>As I understand it, what commands like -firthlogit- do is
>assume that the baseline odds aren't really 0 in the
>population, but that the odds are so small that, just
>because of randomness, your sample by accident did not
>contain any successes in the baseline group. However, if
>the baseline odds are truly 0, as is probably the case in
>your data by definition, then these methods cannot help.
>You can run these programs, but the results just don't
>mean anything.
>
>Hope this helps,
>Maarten
>
>--------------------------
>Maarten L. Buis
>Institut fuer Soziologie
>Universitaet Tuebingen
>Wilhelmstrasse 36
>72074 Tuebingen
>Germany
>
>http://www.maartenbuis.nl
>--------------------------
>
>*
>*   For searches and help try:
>*   http://www.stata.com/help.cgi?search
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/
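To see the "some number divided by 0" problem concretely, here is a rough sketch that artificially forces the baseline odds in the auto data to 0. The recoding of good for domestic cars is purely hypothetical (it is not in the real data or in the original posts), and the scalar names are made up for illustration.

```stata
sysuse auto, clear
gen byte good = rep78 > 3 if rep78 < .

// ASSUMPTION, for illustration only: pretend no domestic car has a good
// repair record, so the baseline (domestic) odds are exactly 0
replace good = 0 if foreign == 0 & !missing(good)

// odds = good cars per bad car, computed by counting
count if good == 1 & foreign == 0
scalar n_good_dom = r(N)
count if good == 0 & foreign == 0
scalar n_bad_dom = r(N)
count if good == 1 & foreign == 1
scalar n_good_for = r(N)
count if good == 0 & foreign == 1
scalar n_bad_for = r(N)

scalar odds_dom = n_good_dom / n_bad_dom
scalar odds_for = n_good_for / n_bad_for

// dividing by a baseline odds of 0: Stata returns missing (.)
scalar or_zero = odds_for / odds_dom
di "baseline odds = " odds_dom
di "odds ratio    = " or_zero

// the same empty cell shows up in the cross-tabulation
tabulate good foreign

// -capture noisily- keeps a do-file running whatever -logit- does;
// it is expected to complain about perfect prediction here
capture noisily logit good i.foreign, or
```

The odds ratio comes out as missing because there is nothing to divide by, which is the situation described for the untreated group in the quoted message.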
