
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: xtmelogit convergence issues and log transforming IVs


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: xtmelogit convergence issues and log transforming IVs
Date   Tue, 24 Jan 2012 21:05:57 +0000

Cube roots map 0 to 0.

If you want something stronger, log(x + 1) or asinh(x) could do.

Whatever you choose, you lose the interpretation of the coefficients
that fits nicely with logarithms.

Nick
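
A minimal Stata sketch of the three zero-friendly transformations mentioned above. The variable name prior_record is taken from the quoted post; the new variable names are hypothetical, and the cube root is written as a power on the assumption that the counts are non-negative.

```stata
* Assumes prior_record is a non-negative count, as described in the quoted post.
* All three transformations map 0 to 0, so no observations become missing.

* Cube root: the mildest of the three
generate cr_priors    = prior_record^(1/3)

* log(x + 1): stronger, and defined at zero
generate log1_priors  = ln(prior_record + 1)

* Inverse hyperbolic sine: behaves like ln(2x) for large x, and is 0 at 0
generate asinh_priors = asinh(prior_record)

* Quick check that zeros stay zeros
summarize cr_priors log1_priors asinh_priors if prior_record == 0
```

None of these keeps the "multiply x by a factor" reading that a plain logarithm gives, which is the trade-off noted above.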

On Tue, Jan 24, 2012 at 8:53 PM, William Hauser <whauseriii@gmail.com> wrote:

> I'm working with a dataset consisting of court cases nested within
> judges nested within circuit.  The model is specified as 2 levels
> (cases<judges) with circuit represented as 19 dummy variables (20
> circuits, 1 omitted as reference).  The outcome is dichotomous so I'm
> using the xtmelogit command.  Stata is version 12, intercooled.
>
> The problem is that the model simply will not converge unless I
> transform two of the predictor variables which are, in their
> untransformed form, highly overdispersed.  These variables represent
> the number of "points" the offender receives for their present offense
> and for their prior record if they have one (more on that shortly).
>
> Problem is, I'm not sure how to interpret the resulting odds ratios
> for the log transformed predictor variables (crime seriousness and
> prior record).
>
> Using the natural log, calculated as ln(xvar), I think the coefficient
> represents the change in odds for increasing x by a factor of ~2.7
> (the value of e).  This would seem to be very unintuitive if correct.
> Alternatively, I can use the log to base 1.10 of the x vars,
> calculated as ln(xvar)/ln(1.10), which I think might be interpreted as
> the odds ratio for a 10% change in x but I'm not at all sure.
>
> So, what is the correct/best transformation for this application and
> how do I interpret it?
>
> There is also the vexing issue of the log of 0.  For crime seriousness
> there are no zeros since everyone committed a crime.  But for prior
> record there are those with no prior record.  One solution that seems
> to be roundly criticized is the addition of a constant such as .5 or 1
> to all cases before logging.  Another solution is to keep the 0's as 0
> and create a dummy coded as 1 for all cases with that 0 value (i.e.
> those that would've been undefined or ".").  The syntax for the latter
> solution looks like this,
>
> gen log_priors=ln(prior_record)/ln(1.10)
> (a bunch of missing values result for all those cases where the
> offender has no prior record)
> replace log_priors=0 if prior_record==0
> gen no_priors=0
> replace no_priors=1 if prior_record==0
>
> Anyone know if this is an acceptable solution or if perhaps another
> transformation that is amenable to zeros is in order?
>
> Any insight or guidance would be greatly appreciated.
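
On the interpretation question in the quoted post, the algebra can be sketched as follows. This assumes a logit model with a single logged predictor x > 0, with the other covariates and the random effects held fixed; the symbols alpha, beta, and k are introduced here for illustration only.

```latex
\begin{align*}
\operatorname{logit}(p) &= \alpha + \beta \ln x \\[4pt]
\text{OR}(x \to kx) &= \exp\{\beta\,(\ln kx - \ln x)\} = \exp(\beta \ln k) = k^{\beta} \\[4pt]
\log_{1.10} x = \frac{\ln x}{\ln 1.10}
  \;&\Rightarrow\; \beta_{1.10} = \beta \ln 1.10,
  \qquad \exp(\beta_{1.10}) = 1.10^{\beta}
\end{align*}
```

So with the natural log, exp(beta) is the odds ratio for multiplying x by e (about 2.718), and with the base-1.10 log the exponentiated coefficient is the odds ratio for a 10% increase in x. The two fits are the same model; only the scale of the coefficient changes. This reading holds exactly only for a pure logarithm, not for log(x + 1) or asinh(x).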


