
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: Re: st: Log Transform Justification
Date: Thu, 30 Aug 2007 17:58:19 +0100

Austin Nichols has already picked up the main point, about logs and logits. I have two further comments.

1. I find it helpful to keep straight the distinction between _transformations_ and _link functions_, the latter jargon particular to the generalised linear models literature but not exclusive to it. For example, a classic logit model with binary outcomes 0 and 1 does not transform the response, nor could it do so, as neither logit 0 nor logit 1 is finite. Rather the point is that the mean response is modelled to vary between 0 and 1 (but can assume neither of those limits). More generally, in -glm- the link function is not strictly a transformation: it is applied to the mean response, not to the individual observed values.

2. I don't have any general recipe for responses on (0,1), [0,1], (0,1] or [0,1) any more than I do for responses on any other supports (apart from always plot your data!). As usual there is a range of models with varying assumptions and some experience and some prejudices about how they work under departures from the optimum.

As a joint author of -betafit- I have some affection for beta models, but affection gets you nowhere in this field. It is explicit that -betafit- ignores exact 0s and 1s and so it should be obvious that it is quite inappropriate whenever they occur. Other procedures do appear to work better in those circumstances.

The trickiest circumstance appears to be whenever there is a spike at 0 or 1 or both in the distribution. For example, if the response is fraction of assets held in savings accounts, then presumably lots of people have no savings accounts and will score exact zeros if they are part of the sample.

This comes up repeatedly on the list. It is fascinating to observe the range of attitudes, including those who appear to assume that there must be a transformation that will somehow fix this, those who say, "Just leave them out", and those who are convinced that the answer must be a two-part model. Whichever way, seeking a panacea is not a good idea. The science of what you are doing has to have the first call.

Nick
[email protected]

Clive Nicholas

> Nick Cox wrote:
>
> > Some confusion here between logarithms and logits?
>
> If I'm thinking straight, you're arguing that the call to -glm,
> link(logit)- only makes sense for models whose dependent variables are
> already scaled 0-1, since the -link()- option does the transformation.
> That's certainly what you suggested to me here:
>
> http://www.stata.com/statalist/archive/2007-01/msg00315.html
>
> I feel a Statalist sequel coming on. I've just finished re-fitting a
> batch of fractional logit models to voting intention data after
> discovering that I had log-transformed the dependent variables when it
> wasn't necessary, largely due to the re-reading of the above post!
>
> If this is so, which is the most appropriate Stata routine with which
> to fit an LT-OLS regression model? Note that not everybody in my field
> thinks this to be a good idea, anyway; indeed Paolino's (2001)
> extensive Monte Carlo tests found that such models come off third best
> against pure OLS and beta-distributed models in terms of bias,
> efficiency and 'overconfidence', and across a range of distributions
> to boot. It was this paper that encouraged me to move away from such
> an approach.
>
> Paolino P (2001) "Maximum Likelihood Estimation of Models with
> Beta-Distributed Dependent Variables", Political Analysis 9(4):
> 325-46.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
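
The distinction in point 1 of the message, between transforming the response and applying a link function to its mean, can be sketched numerically. The snippet below is plain Python, not Stata, and is only an illustration under stated assumptions: the ten binary outcomes are made up, and `logit` and `inverse_logit` are helper names introduced here. It relies on the fact that, for a logit model with only an intercept, the fitted intercept equals the logit of the sample mean.

```python
import math

def logit(p):
    """Log odds of p; finite only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError(f"logit is not finite at p = {p}")
    return math.log(p / (1.0 - p))

def inverse_logit(eta):
    """Map a linear predictor back to a mean strictly inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical binary outcomes, invented for this illustration.
y = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]

# Treating the logit as a transformation of the response fails
# for every observation, because each value is exactly 0 or 1.
failed = 0
for value in y:
    try:
        logit(value)
    except ValueError:
        failed += 1

# Treating the logit as a link works: it is applied to the mean
# response, which lies strictly between 0 and 1 here.
mean_y = sum(y) / len(y)
eta = logit(mean_y)             # intercept of an intercept-only logit model
recovered = inverse_logit(eta)  # back on the scale of the mean

print(f"observations that cannot be transformed: {failed} of {len(y)}")
print(f"mean response: {mean_y:.3f}, linear predictor: {eta:.3f}")
print(f"mean recovered from the linear predictor: {recovered:.3f}")
```

The same logic is why a fractional response containing exact 0s or 1s can still be modelled through a logit link, even though those values could not be logit-transformed one by one.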
