Re: st: Permutations and logistic regression (Stata 8)
At 7:55 AM -0700 6/29/05, n p wrote:
I have a small dataset with 21 subjects. Seven out of the 21
subjects experienced a specific event during a predetermined amount
of time whereas the remaining did not. I would like to investigate
the effect of various continuous variables which have been measured
at the beginning of the experiment on the probability of the event
while adjusting for gender and weight. I would normally go with a
logistic regression e.g.
xi:logit event continuous_var weight i.gender
but I am thinking that the sample size is too small. Is it correct
to use permutations to deal with the small sample size and if yes is
the following syntax correct?
permute event "xi:logit event continuous_var weight i.gender" _b, reps(5000)
A nice idea, but I don't believe the command above will give you what
you want. The reason is that although you are using -permute-, you
are still using -logit- (i.e., unconditional maximum likelihood) to
estimate the parameters for each permutation. Thus, you won't have
estimates for those permutations where there is complete separation
(i.e., where your covariates perfectly predict the response), and the
resulting incomplete permutation distribution will be incorrect.
This problem will be particularly pronounced with a small dataset
which, of course, is exactly when the issue of exact analyses arises.
In fact, the typical extreme example is the case where the
unconditional MLE doesn't exist even for the original (un-permuted) data.
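For a single covariate, complete separation is easy to check directly: it
occurs when every covariate value among the events lies strictly above (or
strictly below) every value among the non-events. The following is a minimal
sketch in Python rather than Stata, with a hypothetical function name. It
does not cover a model with several covariates, where separation is by a
linear combination of the covariates and a simple range check is not enough.

```python
def single_covariate_separated(y, x):
    """Return True if one covariate x completely separates the 0/1 response y.

    Assumption for this sketch: the model has an intercept and this single
    covariate only. With more covariates, this check is not sufficient.
    """
    x_events = [xi for xi, yi in zip(x, y) if yi == 1]
    x_nonevents = [xi for xi, yi in zip(x, y) if yi == 0]
    if not x_events or not x_nonevents:
        # Degenerate case: all responses equal, so no finite MLE either.
        return True
    # Separation: the two groups of x values do not overlap or touch.
    return max(x_nonevents) < min(x_events) or max(x_events) < min(x_nonevents)


if __name__ == "__main__":
    print(single_covariate_separated([0, 0, 1, 1], [1, 2, 3, 4]))
    print(single_covariate_separated([0, 1, 0, 1], [1, 2, 3, 4]))
```

Any permutation for which a check like this is true contributes no finite
estimate from -logit-, which is how the permutation distribution ends up
incomplete.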
To use -permute- to do hypothesis testing within the context of a
logistic regression model, you'd need to base the test(s) on the
sufficient statistic(s) for your model rather than on the
unconditional maximum likelihood estimate. This would be pretty
straightforward, and I believe you could use -permute- to do it.
However, to get actual parameter estimates and standard errors, you'd
need to maximize the appropriate conditional likelihood, and even
then I believe that there are situations where a maximum does not
exist (in such cases, a different estimator is needed).
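As a rough illustration of a test based on the sufficient statistic, here is
a sketch in Python (not Stata), assuming the simplest model with an intercept
and one covariate x. In that model the sufficient statistic for the slope is
T = sum of x_i * y_i, and permuting y keeps the sufficient statistic for the
intercept (the number of events) fixed. The function name, the seed, the
two-sided definition of "extreme", and the add-one p-value are choices made
for this example; they are not something -permute- prescribes.

```python
import random


def perm_test_sufficient_stat(y, x, reps=5000, seed=12345):
    """Monte Carlo permutation test based on T = sum(x_i * y_i).

    y is a 0/1 response, x a single covariate. Returns (t_obs, p_value).
    No likelihood is maximized, so separation is not a problem here.
    """
    rng = random.Random(seed)
    t_obs = sum(xi * yi for xi, yi in zip(x, y))
    # Mean of T over all permutations: (number of events) * mean(x).
    center = sum(y) * sum(x) / len(x)
    y_perm = list(y)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(y_perm)  # keeps the total number of events fixed
        t = sum(xi * yi for xi, yi in zip(x, y_perm))
        # Two-sided: at least as far from the permutation mean as observed.
        if abs(t - center) >= abs(t_obs - center) - 1e-12:
            extreme += 1
    # Add-one correction so the Monte Carlo p-value is never exactly zero.
    return t_obs, (extreme + 1) / (reps + 1)


if __name__ == "__main__":
    # Made-up data: 21 subjects, events in the 7 subjects with the largest x.
    x = list(range(21))
    y = [0] * 14 + [1] * 7
    print(perm_test_sufficient_stat(y, x))
```

This only tests the slope. It gives no estimate or standard error, and
adjusting for further covariates means conditioning on their sufficient
statistics as well.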
One more comment. With the -permute- command as you've specified it
above, you are conditioning merely on the total number of events. In
many problems, however, there are other covariates which you may also
wish to regard as nuisance parameters, and your inference should
condition on these as well. If these are discrete variables, you may
be able to use the strata() option of -permute- to obtain the
appropriate conditional permutation distribution.
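To make the idea of a stratified permutation concrete, here is a small
Python sketch (again not Stata, and with hypothetical names) that shuffles
the response separately within each level of a discrete variable such as
gender. Every permuted dataset then has the same number of events in each
stratum as the original data, which is the conditioning described above.
It illustrates the idea only and makes no claim about how -permute-
implements strata() internally.

```python
import random
from collections import defaultdict


def permute_within_strata(y, strata, rng):
    """Return a copy of y shuffled separately inside each stratum."""
    # Collect the row positions belonging to each stratum.
    positions = defaultdict(list)
    for i, s in enumerate(strata):
        positions[s].append(i)
    y_new = list(y)
    for idx in positions.values():
        values = [y[i] for i in idx]
        rng.shuffle(values)  # shuffle only among rows of this stratum
        for i, v in zip(idx, values):
            y_new[i] = v
    return y_new


if __name__ == "__main__":
    rng = random.Random(2005)
    event = [1, 0, 0, 1, 1, 0, 0]                 # made-up responses
    gender = ["m", "m", "m", "f", "f", "f", "f"]  # made-up strata
    print(permute_within_strata(event, gender, rng))
```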
WARNING: Please note that this is a topic (i.e., exact logistic
regression) which I know almost nothing about, and so you should
consume my response very critically. You can read the theory in Cox
and Snell (1989), and I know that there are several accessible papers
in the statistical literature (sorry, I don't have any references
easily available at the moment). Hopefully those on the list more
knowledgeable than I will correct any misstatements I may have made.
Cox, D. R., and E. J. Snell. 1989. Analysis of Binary Data, 2nd edn.
New York: Chapman & Hall.