Stata The Stata listserver

st: RE: Log Transformation and GLM

From   "Maarten Buis" <>
To   <>
Subject   st: RE: Log Transformation and GLM
Date   Mon, 16 Jan 2006 08:53:09 +0100

Hi Chuck,

First, if you have a zero-inflated dataset and you take the log of the response variable, then you turn all 0s into missing values and all 1s into 0s. That is probably a bad idea. Second, a log link means that you model the log of the rate at which events happen, which is not the same as log-transforming the dependent variable. Third, it seems you have been doing some heavy-duty data snooping, and some members of the statistical police won't approve of that.
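
To make the first two points concrete, here is a minimal sketch in plain Python (not Stata, and not a fitted model). The counts are made up for illustration, and `log_or_missing` is a hypothetical helper that uses `None` to stand in for a missing value. It shows what the transformation does to the 0s and 1s, and that the log of the mean (what a log link works with) is a different quantity from the mean of the logged response.

```python
import math

def log_or_missing(y):
    """Return log(y), or None (standing in for a missing value) when y <= 0."""
    return math.log(y) if y > 0 else None

# Made-up, zero-heavy counts; purely illustrative, not from the original post.
counts = [0, 0, 0, 1, 1, 2, 5, 40]

# Log-transforming the response: 0s become missing, 1s become 0.
logged = [log_or_missing(y) for y in counts]
n_missing = sum(v is None for v in logged)   # 3 observations are lost
n_zero = sum(v == 0.0 for v in logged)       # the two 1s are now 0

# A log link works with the log of the mean of the untransformed response.
log_of_mean = math.log(sum(counts) / len(counts))

# A log-transformed response gives the mean of the logs instead,
# computed only over the observations that survived the transformation.
kept = [v for v in logged if v is not None]
mean_of_logs = sum(kept) / len(kept)

print("missing after log:", n_missing)
print("zeros after log:", n_zero)
print("log of mean:", round(log_of_mean, 3))
print("mean of logs:", round(mean_of_logs, 3))
```

The two summaries differ both because the zeros are dropped and because the log is taken at a different stage, so fitting a log-link model to an already logged response stacks two different operations rather than repeating the same one.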


Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214 

+31 20 5986715

-----Original Message-----
From: []On Behalf Of Charles Goss
Sent: maandag 16 januari 2006 5:29
Subject: st: Log Transformation and GLM

Hello Statalist,

I am having some issues analyzing data with glm.  I have tried several
methods to analyze my zero-inflated data set (zinb, hurdle and glm).
The best model fit that I get is when I log transform the response
variable prior to analysis with a glm model using a negative binomial
distribution.  The negative binomial uses a log link function, so I
think that this analysis is essentially double log-transforming the
data, once initially, and then when the response is linked to the
predictors it is log-transformed again.  I have not been able to find
any literature regarding this, so I was wondering if anyone knows if
this is an appropriate way to analyze these data?  Does it violate
assumptions of the glm??  Thanks for your time.

