
Reposted: st: cytel challenge


From   Tero T Kivela <tekivela@cc.helsinki.fi>
To   statalist@hsphsun2.harvard.edu
Subject   Reposted: st: cytel challenge
Date   Mon, 29 Mar 2004 20:07:27 +0300 (EET DST)

On Wed, 21 Nov 2001, William Gould wrote:

> Lee Sieswerda <Lee.Sieswerda@tbdhu.com> wrote,
>
> > Cytel makes LogXact for doing "exact" logistic regression (www.cytel.com).
> > In their ads they have something called the "Cytel Challenge" where they ask
> > people to try to fit a logistic regression model to the following data:
> >
> >         Diar    AB    Age    Hosp
> >         0       0     0      0
> >         6       0     0      1
> >         1.9     0     1      0
> >         2.9     0     1      1
> >         100     1     1      1
> >
> > The percentage of patients with diarrhea (Diar) is the outcome and the other
> > three variables are predictors: [...]  Taking the challenge using the
> > -logistic- command in Stata fails to produce a converged model. I really don't know
> > the details of how LogXact manages to fit this model, but my question is:
> > would it not be possible to program Stata to do "exact" logistic regression
> > and be able to fit this model? Or is there something inherently different
> > about Cytel's software that it can accomplish this and Stata cannot?
>
> I take issue with Lee's comment that "Stata fails to produce a converged
> model" -- Lee did something wrong -- but I do not take issue with Cytel's
> ad (although I have not seen it).
>
> Something evidently got left out of the ad or the posting because, to do the
> above example, we need to know the population sizes.  Nevertheless, I went to
> Cytel's web site and found a longer problem on which the ad was obviously
> based.  The URL is http://www.cytel.com/new.pages/LX.ex.04.html.
>
> In the longer problem, there are more observations, the population is
> included, and there are five independent variables:  Cephalexin, Clindomycin,
> Sex, Age, and LOS.  In any case, the web site says,
>
>         Challenge: Try fitting a logistic regression model to the data with
>                    all five covariates included.
>
> so let's do that and see exactly the point Cytel wishes to make.
>
> After loading the data, I had 18 observations and the first five looked like
> this:
>
>         . list in 1/5
>
>               diarrhea   totno   cephalex   clindomy   sex   age   los
>           1.         0     174          0          0     0     0     0
>           2.         1     113          0          0     0     0     1
>           3.         0     349          0          0     0     1     0
>           4.        16     451          0          0     0     1     1
>           5.         0     213          0          0     1     0     0
>
> To estimate this model, I must use the -blogit- command, since that is
> Stata's logit command for grouped data, where the dependent variable holds
> the count of positive outcomes alongside the total population.  I also specify
> the -or- option to obtain odds ratios.  Here is the result of running the
> model:
>
> ==============================================================================
> . blogit diarrhea totno  cep cli sex age los, or
> note: cephalex~=0 predicts success perfectly
>       cephalex dropped and 2 obs not used
>
>
> Logit estimates                                   Number of obs   =       2488
>                                                   LR chi2(4)      =      91.48
>                                                   Prob > chi2     =     0.0000
> Log likelihood = -218.30047                       Pseudo R2       =     0.1732
>
> ------------------------------------------------------------------------------
>     _outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>     clindomy |   9.198602    2.89523     7.05   0.000     4.963739    17.04648
>          sex |   .8263463   .2336678    -0.67   0.500      .474751    1.438329
>          age |   2.440564   1.176263     1.85   0.064      .948947    6.276803
>          los |   11.84492   7.113316     4.12   0.000      3.65051    38.43354
> ------------------------------------------------------------------------------
> ==============================================================================
>
> What Cytel wants you to notice is Stata's message
>
>         note: cephalex~=0 predicts success perfectly
>               cephalex dropped and 2 obs not used
>
> That is the point of their challenge:  When cephalex is nonzero, there is
> always a positive outcome:
>
>         . list if ceph==1
>
>               diarrhea   totno   cephalex   clindomy   sex   age   los
>          17.         1       1          1          0     0     1     1
>          18.         4       4          1          0     1     1     1
>
> There are a total of 5 patients who were observed with cephalex==1 and all
> five patients suffered from diarrhea.  How do you interpret that?  Does that
> mean cephalex==1 always results in diarrhea?  Well, of course it does not.
> With only five such patients, Cytel's computationally intensive methods were
> able to put a confidence interval around the result:  [27.52, infinity].  Very
> nice.  (I would like somebody to explain to me why the point estimate is a
> finite 207.40 rather than infinite, but I'm sure Cytel has carefully
> considered the answers they produce).
>
> In any case, Stata smartly recognized its limitations and estimated the model
> conditional on cephalex==0.  Some other packages might not have recognized the
> problem and gotten messed up in the numerics.
>
> I leave it for you to decide how important it is to put a confidence interval
> around cephalexin in this particular case, but without question there are
> problems for which doing this kind of thing is important.
>
> -- Bill
> wgould@stata.com
>
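
The perfect-prediction point in the quoted message can be illustrated
numerically.  What follows is a minimal sketch in Python (not Stata), under
stated assumptions: it looks only at the five patients with cephalex==1,
treats the linear predictor from the other covariates as a fixed,
hypothetical value (eta0), and uses illustrative function names that do not
come from Stata or LogXact.  It shows that the log likelihood for these
patients keeps rising as the cephalexin coefficient grows, so no finite
maximum exists, and it computes a simple exact (Clopper-Pearson) lower limit
for the probability of diarrhea when all 5 of 5 patients are affected.  It
does not reproduce LogXact's conditional exact interval for the odds ratio.

```python
import math


def loglik_cephalex_group(b, eta0=0.0, successes=5, total=5):
    """Log-likelihood contribution of the cephalex==1 patients.

    b    : log odds ratio for cephalexin
    eta0 : linear predictor from the other covariates
           (hypothetical value, chosen only for illustration)
    """
    eta = eta0 + b
    # log(p) and log(1 - p) for p = 1 / (1 + exp(-eta))
    log_p = -math.log1p(math.exp(-eta))
    log_q = -math.log1p(math.exp(eta))
    return successes * log_p + (total - successes) * log_q


def exact_lower_bound(n, alpha=0.05):
    """Two-sided Clopper-Pearson lower limit when all n trials succeed.

    With n successes in n trials the lower limit p solves p**n = alpha/2.
    """
    return (alpha / 2) ** (1.0 / n)


if __name__ == "__main__":
    # The log likelihood increases toward 0 and never attains a maximum.
    for b in (0, 2, 5, 10, 20):
        print(f"b = {b:2d}   loglik = {loglik_cephalex_group(b):.6f}")

    # Exact lower limit for P(diarrhea | cephalex == 1) from 5 of 5.
    print(f"lower 95% limit = {exact_lower_bound(5):.3f}")
```

Because the log likelihood is strictly increasing in b, an iterative
maximizer has nothing finite to converge to for this coefficient, which is
consistent with -blogit- dropping cephalex and the 2 observations.  The
exact limit shows why an interval that is bounded on only one side is a
natural summary for a group in which every patient had the outcome.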
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


