Re: st: Abnormal logistic results


From   Ras Dondo <ras.dondo@yahoo.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Abnormal logistic results
Date   Tue, 16 Oct 2012 03:42:42 -0700 (PDT)

Thanks, Maarten. Would it make any difference if, instead of comparing those exposed and those not exposed to drug X, I compared those with the disease to those without the disease, in order to estimate the odds of exposure to the drug?
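
For the crude (unadjusted) 2x2 table, the question above can be checked by hand. The following is a minimal Python sketch, not Stata code, using the counts quoted further down in this thread (1 exposed child among the 89 with the disease, 1 exposed child among the 6,079 without). The variable names are illustrative. It only covers the crude table; it does not show what happens once covariates are added.

```python
import math

# Crude 2x2 counts as quoted later in this thread.
a = 1      # disease, exposed to drug X
b = 88     # disease, not exposed (89 - 1)
c = 1      # no disease, exposed to drug X
d = 6078   # no disease, not exposed (6079 - 1)

# Odds ratio of disease, comparing exposed with unexposed children.
or_disease = (a / c) / (b / d)

# Odds ratio of exposure, comparing diseased with non-diseased children.
or_exposure = (a / b) / (c / d)

# Both reduce algebraically to the cross-product ratio a*d / (b*c).
cross_product = (a * d) / (b * c)

print(or_disease, or_exposure, cross_product)
print(math.isclose(or_disease, or_exposure))
```

Both ratios are the same cross-product, so for the crude table the direction of the comparison does not change the odds ratio, and it does not add any exposed children to the data.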

Thanks



----- Original Message -----
From: Maarten Buis <maartenlbuis@gmail.com>
To: statalist@hsphsun2.harvard.edu
Cc: 
Sent: Monday, October 15, 2012 8:55 AM
Subject: Re: st: Abnormal logistic results

On Mon, Oct 15, 2012 at 4:13 AM, Ras Dondo wrote:
> I ran a logistic regression on my data and got abnormal results, and I would like advice on how to rectify the problem. I have a dataset containing five variables:
> 1. disease status of the child (binary 1/0), 2. mother's age (grouped in 5-year intervals), 3. state (12 states), 4. child's year of birth (grouped into 5 levels), and 5. drug X (binary). My objective was to calculate the OR and associated 95% CI of the baby having the disease when it was exposed to drug X in the womb, adjusting for maternal age, state, and child's year of birth.
> I had a sample size of 6,168 children: 89 had the disease, 1 of whom was exposed to the drug, and 6,079 did not have the disease, also with 1 child exposed to the drug in the womb.

The exposure to the drug is just too rare. That means you have
virtually no information about its effect in the dataset. The
information in a dataset that we use in these models comes from
comparing groups. The group exposed to drug X is very small (2
observations), so we know very little about that group. Even though
you have more than 6,000 observations, nearly all of them only carry
information about the group that is not exposed to drug X. To compare
those exposed and those not exposed you need to know a lot about both
groups.
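
To make the point about information concrete, here is a rough Python sketch (not Stata, and not part of the original advice) of the crude odds ratio with a Wald 95% confidence interval, computed from the counts quoted above. The Wald formula is an assumption chosen for simplicity; with cells this small it is a poor approximation and exact methods would normally be preferred.

```python
import math

# Crude 2x2 counts quoted above.
a = 1      # disease, exposed
b = 88     # disease, not exposed
c = 1      # no disease, exposed
d = 6078   # no disease, not exposed

# Log odds ratio and its Wald standard error: sqrt(1/a + 1/b + 1/c + 1/d).
log_or = math.log((a * d) / (b * c))
variance = 1 / a + 1 / b + 1 / c + 1 / d
se = math.sqrt(variance)

# Approximate 95% interval on the odds-ratio scale.
lower = math.exp(log_or - 1.96 * se)
upper = math.exp(log_or + 1.96 * se)

# Share of the variance contributed by the two exposed cells alone.
exposed_share = (1 / a + 1 / c) / variance

print(f"OR = {math.exp(log_or):.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
print(f"share of variance from the exposed cells = {exposed_share:.3f}")
```

The two exposed cells contribute almost all of the variance, so the interval is extremely wide no matter how many unexposed children are in the sample.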

So I am not surprised that you cannot adjust for mother's age, state,
and child's year of birth. You need to simplify your model: e.g. adjust
for rougher groupings of states (in the US context you can think of
south versus non-south) instead of state, and adjust for mother's age
with a linear spline with one knot instead of 5 categories, and do the
same with child's year of birth. For linear splines see: -help
mkspline-. Also make sure you center mother's age and child's year of
birth at a meaningful value within the range of the data. Even if you
do all that, I would not be surprised if you still do not get a
meaningful answer; your data are very extreme, with so few observations
exposed to drug X (*).
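
As an illustration of the linear spline and centering suggested above, here is a small Python sketch of the two variables that a one-knot linear spline produces. It is not Stata's mkspline itself; the function name, the knot at 30, the centering value of 25, and the example ages are all hypothetical, and it assumes mother's age is available in years (or as group midpoints) rather than only as category codes.

```python
def linear_spline_one_knot(x, knot, center):
    """Return the two terms of a one-knot linear spline for x.

    below: x capped at the knot, shifted so that it is 0 at `center`.
    above: how far x lies beyond the knot (0 at or below the knot).
    In a regression, the coefficient on `below` is the slope before the
    knot and the coefficient on `above` is the slope after the knot.
    """
    below = min(x, knot) - center
    above = max(x - knot, 0)
    return below, above


# Hypothetical mother's ages in years; knot and center are assumptions.
ages = [18, 25, 30, 37, 42]
terms = [linear_spline_one_knot(age, knot=30, center=25) for age in ages]

for age, (below, above) in zip(ages, terms):
    print(age, below, above)
```

This replaces several age-category indicators with two terms, and centering means the constant refers to a mother aged 25 rather than to age 0.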

Hope this helps,
Maarten

(*) I suspect that this is one of those cases where as a researcher
you would want this to be less rare, but as a person you are glad this
happens so rarely.

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

