The Stata listserver

st: Logistic regression


From   "Statann Lindberg" <[email protected]>
To   [email protected]
Subject   st: Logistic regression
Date   Tue, 21 Jan 2003 13:52:42 +0000

Dear all,

I am trying to refresh my statistical knowledge and at the same time refresh my experience with Stata (I currently run version 6, but will upgrade to 8). Now I desperately need some guidance on the best way to approach an analysis of a data set I have.

The data come from a study where 19 pregnant sows (16 with infected foetuses, 3 infected with healthy foetuses) have been tested for antibodies (expressed as an optical density (OD)) repeatedly during gestation. Each sow has been tested 8 times.
Basically I want to know if OD and the day in gestation (DIG) when the sow was tested can predict whether the foetus is infected or not. I anticipate that there may be an interaction between OD and DIG so I also want to include an interaction term.

Typing:
logistic inf od*dig, cluster(id) asis
only gives estimates for the main effects (od and dig) and not for the effect modifier that I wanted to test. Why is this?

However, if I first create an interaction term by typing
generate od_dig = od*dig
and then type
logistic inf od*dig, cluster(id) asis
Stata somehow identifies od_dig as something I want in the model.
This was something I had not expected, but OK.

However, I also have a multicollinearity problem (high correlation between the interaction term and the main effects), so I tried to reduce it by centering od: I created a new variable, od2 = od - mean(od), and a new interaction term, od2_dat = od2*dig.
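
A minimal sketch of the centering step described above, assuming the mean of od is taken over all observations (the post does not say whether an overall or a per-sow mean was used). It relies on the r(mean) result that summarize leaves behind:

```stata
* Center od at its overall mean
* (assumption: mean over all observations, not per sow)
summarize od, meanonly
generate od2 = od - r(mean)

* Interaction of the centered od with day in gestation
generate od2_dat = od2*dig
```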

Then at some point I ran my first version again:
logistic inf od*dig, cluster(id) asis
and got the message:
Note: od dropped due to collinearity.
Note: od2_dat dropped due to collinearity.

Hmm. I didn't ask for od2_dat to be in the model?? How come it was included?
(I understand that I probably lack some vital Stata info here, so please excuse me!)

Then I typed:
logistic inf od2 dig od2_dat, cluster(id) asis
which appears to work.

Apart from this confusion over the interaction term, I am still uncertain whether I have chosen a reasonably correct way of analysing these data, considering that I have repeated measurements on individuals. I hope that specifying the cluster option accounts for that, but does it do so properly? Somebody warned me that, because the mean and variance of a binomial dependent variable are linked, adjusting only the variance could still leave the point estimates biased, but I have not found a discussion of that topic in the manual (so far).
I have tried to use xtgee as well, but I cannot get it to converge.

Enough questions for now, I hope someone out there can help!

/Louise Hiley
