Statalist The Stata Listserver


Re: st: Re: Correlation b/w independent variables in xtlogit

From   SamL
To   Stata Listserve
Subject   Re: st: Re: Correlation b/w independent variables in xtlogit
Date   Thu, 17 May 2007 07:05:50 -0700 (PDT)

Age and age-squared need not be correlated very highly.  If you calculate
agex=age-mean_age, and then calculate agexsq=(agex*agex), then agex and
agexsq will be weakly correlated if at all.  Maybe that was already done
in the dataset you are using.
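
A minimal sketch of the centering step, assuming made-up ages drawn
uniformly between 20 and 60 rather than the original poster's panel data
(the simulated data, the seed, and the name agesq are illustrative
assumptions; agex and agexsq follow the names used above):

```stata
* Simulated stand-in data; not the poster's dataset.
clear
set obs 1000
set seed 12345
generate age = 20 + 40*runiform()

* Raw terms: age and its square are typically very highly correlated.
generate agesq = age^2
correlate age agesq

* Centered terms: subtract the sample mean before squaring.
summarize age, meanonly
generate agex = age - r(mean)
generate agexsq = agex^2
correlate agex agexsq
```

The covariance between a mean-centered variable and its square is the
variable's third central moment, so the second correlation is close to
zero only when the age distribution is roughly symmetric.  With skewed
ages some correlation will remain.  In a model that includes a constant,
the centered pair spans the same quadratic in age as the raw pair, so
centering changes how the coefficients read, not the fitted curve.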


On Thu, 17 May 2007, Michael Blasnik wrote:

> ...
> I view this issue as more about interpretability of the coefficient(s).  You
> don't really need to worry about highly correlated terms, like age and age
> squared, if they are control variables in your model, but you would if either is
> the primary effect of interest.
> Models that include age and age squared are typically interested in controlling
> for age and want to allow for some nonlinearity.  The analyst is not usually
> trying to interpret either of the coefficients, but is actually interested in
> other coefficients in the model.  On the other hand, if you have highly
> correlated terms about the primary effects of interest, then collinearity can be
> a significant problem -- you can't really measure the effect of either of the
> correlated terms very well and the standard errors should show this.  How you
> should proceed depends on the subject matter -- you may want to think about
> somehow combining the two terms or you *could* just drop one.  If you drop a
> term, you need to think about how to interpret the remaining term since the
> coefficient now will include the effects of the dropped term as well.
> Michael Blasnik
> ----- Original Message -----
> From: "Alexandra Wilson" <>
> To: <>
> Sent: Thursday, May 17, 2007 1:05 AM
> Subject: st: Correlation b/w independent variables in xtlogit
> > Dear Statalisters.
> > I have a simple question: if the answer is well known to everyone but me,
> > apologies, but I am living in Tanzania where there is a dearth of
> > statisticians and stats books, and I have trawled the internet and the
> > statalist archives to no avail.
> > I am running a panel regression with a dichotomous variable using xtlogit.
> > I was getting strange (unexpected) results, and realized 2 of my independent
> > variables were highly correlated (correlation coefficient 0.92).  So I
> > omitted one and the results were much more in line with other tests.  But in
> > my list of independent variables I still have a variable for age (of panel
> > subject) and a variable for the square of age.  These 2 variables are, of
> > course, also highly correlated.  So why is it correct to leave both these
> > highly correlated variables in the regression, and yet to exclude the other
> > highly correlated variable?
> > Any enlightenment much appreciated.
> > Alexandra Wilson
