Re: st: Abnormal logistic results

From   Maarten Buis
Subject   Re: st: Abnormal logistic results
Date   Mon, 15 Oct 2012 09:55:02 +0200

On Mon, Oct 15, 2012 at 4:13 AM, Ras Dondo wrote:
> I ran a logistic regression on my data and got abnormal results, and I would like advice on how to rectify the problem. I have a dataset containing five variables:
> 1. whether the child has the disease (binary 1/0), 2. mother's age (grouped in 5-year intervals), 3. state (12 states), 4. child's year of birth (grouped into 5 levels), and 5. drug X (binary). My objective was to calculate the OR and associated 95% CI of the baby having the disease when it was exposed to drug X in the womb, adjusting for maternal age, state, and child's year of birth.
> I had a sample size of 6,168 children, of which 89 had the disease, with 1 child exposed to the drug, and 6,079 children without the disease, also with 1 child exposed to the drug in the womb.

The exposure to the drug is just too rare. That means you have
virtually no information about its effect. The information in a dataset
that we use in these models comes from comparing groups. The group
exposed to drug X is very small (2 observations), so we know very
little about that group. Even though you have more than 6,000
observations, these observations only contain a lot of information
about the group that is not exposed to drug X. To do a comparison of
those exposed and not exposed you need to know a lot about both groups.
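
As a rough illustration of how little two exposed children can tell us, the crude (unadjusted) odds ratio and a Woolf-type 95% interval can be worked out from the counts in the question. The 2x2 cell counts below are derived from those totals (88 = 89 - 1 and 6078 = 6079 - 1). The normal approximation is itself unreliable with cells of size 1, so the numbers are only indicative:

```latex
\[
\widehat{\mathrm{OR}} = \frac{1 \times 6078}{88 \times 1} \approx 69.1,
\qquad
\mathrm{SE}\bigl(\ln \widehat{\mathrm{OR}}\bigr)
  = \sqrt{\tfrac{1}{1} + \tfrac{1}{1} + \tfrac{1}{88} + \tfrac{1}{6078}}
  \approx 1.42
\]
\[
95\%\ \mathrm{CI} \approx
  \exp\bigl(\ln 69.1 \pm 1.96 \times 1.42\bigr)
  \approx (4.3,\ 1100)
\]
```

Almost all of the variance (2 out of about 2.01) comes from the two cells that contain a single child, which is why the 6,000-plus unexposed observations barely narrow the interval.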

So I am not surprised that you cannot adjust for mother's age, state,
and child's year of birth. You need to simplify your model: e.g. adjust
for coarser groupings of states (in the US context you can think of
south versus non-south) instead of state, and adjust for mother's age
with a linear spline with one knot instead of 5 categories; same with
child's year of birth. For linear splines see: -help mkspline-. Also
make sure you center mother's age and child's year of birth at a
meaningful value within the range of the data. Even if you do all
that, I would not be surprised if you still do not get a meaningful
answer; your data are very extreme, with so few observations exposed
to drug X(*).
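
The simplifications described above might look roughly like the following sketch. All variable names (disease, drugx, state, mage, byear) are hypothetical, as are the state codes treated as "south", the centering values, and the knot locations. It also assumes mother's age and year of birth are stored as numeric values (e.g. group midpoints) rather than as category codes:

```stata
* Hypothetical variable names: disease, drugx, state, mage, byear

* Coarser state grouping (state codes 1-4 as "south" is an assumed example)
generate byte south = inrange(state, 1, 4) if !missing(state)

* Center at a meaningful value inside the observed range
* (30 and 2000 are assumptions, not values taken from the data)
generate mage_c  = mage  - 30
generate byear_c = byear - 2000

* Linear splines with a single knot, here placed at the centering value
mkspline mage_lo  0 mage_hi  = mage_c
mkspline byear_lo 0 byear_hi = byear_c

* Report odds ratios with 95% confidence intervals
logit disease drugx south mage_lo mage_hi byear_lo byear_hi, or
```

This replaces 11 state dummies and 2 sets of category dummies with five adjustment terms, but the estimate for drugx still rests on only two exposed children.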

Hope this helps,

(*) I suspect that this is one of those cases where as a researcher
you would want this to be less rare, but as a person you are glad this
happens so rarely.

Maarten L. Buis
Reichpietschufer 50
10785 Berlin
