Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: Re: xtmelogit convergence issues and log transforming IVs

 From Nick Cox To "'statalist@hsphsun2.harvard.edu'" Subject st: RE: Re: xtmelogit convergence issues and log transforming IVs Date Wed, 25 Jan 2012 20:48:18 +0000

```Two minor comments:

1. Normality of any predictor is not an assumption of regression-type models, although it does no harm. (Those striving for normality of predictors as the ideal need to explain why they are also happy to use indicators, a.k.a. dummies.)

2. I don't know what forum(s) you are going to present to, but my reaction, right or wrong, is a likely one from many audiences, so I'd recommend making sure that your argument about 0 and 1 priors and the indicator is bullet-proof. It still seems strange to me, but I don't have a refutation. I'd want to do lots of exploratory analysis to get the functional dependence right.

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of William Hauser
Sent: 25 January 2012 20:24
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: xtmelogit convergence issues and log transforming IVs

Nick,
You're right in that both 0 and 1 would be mapped to 0 by the log
transformation.  But only the cases with 0 on the original prior
record score would be coded as 1 on the dummy.  Thus, I think the
inclusion of both the log transformed variable and the dummy for no
priors would effectively parse the variation such that the log
transformed variable would represent the effect of increases in prior
record for those that have a prior record and the dummy would
represent the effect for those that do not.  I *think* the math holds
up: dummy variables similarly have shared or overlapping 0's, and
there is no issue there.

As to the base of the log, it does matter - the coefficients are very
different.  I think the ln transformed variable coefficient represents
the effect of a 172% increase (i.e. an increase by a factor of 2.72), which
is not at all intuitive as compared to the effect of a 10% increase
which is, I think, what you get with a log to base 1.1.  That is, if
my interpretation is correct.  On the other hand, both transformations
as well as a log to base 10 or base 2 bring the variables into
normality so from the model fit perspective I would agree that the
base of the log doesn't really matter.

Thanks for following up and clarifying,

Will Hauser
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
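The point about the base of the log can be checked numerically. Since log_b(x) = ln(x) / ln(b), changing the base only divides the predictor by a constant, so the fitted coefficient is multiplied by ln(b) and the fit itself is unchanged. The following is a minimal sketch under stated assumptions: the data are made up, the outcome is treated as linear rather than fitted with xtmelogit, and `ols_slope`, `prior`, and `y` are hypothetical names used only for illustration.

```python
import math

def ols_slope(x, y):
    """Slope of a simple least-squares line of y on x (illustrative helper)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Made-up prior record scores (all >= 1) and a made-up outcome.
prior = [1, 2, 3, 5, 8, 13, 21, 34]
y = [0.3 * math.log(p) + 0.1 * ((i % 3) - 1) for i, p in enumerate(prior)]

# Fit the same outcome on the log of the predictor taken to several bases.
slopes = {}
for base in (math.e, 10, 2, 1.1):
    x = [math.log(p) / math.log(base) for p in prior]  # log to the given base
    slopes[base] = ols_slope(x, y)

# Dividing each slope by ln(base) recovers the natural-log slope.
for base, slope in slopes.items():
    print(base, slope, slope / math.log(base))
```

Under this reading, the base 1.1 coefficient is the change in the outcome associated with multiplying the predictor by 1.1 (a 10% increase), and the ln coefficient is the change associated with multiplying it by about 2.72. Both describe the same fitted relationship on different scales.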