



Re: st: Right skewed (positive) dependent variable


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Right skewed (positive) dependent variable
Date   Thu, 10 Jun 2010 14:37:55 +0000 (GMT)

--- On Thu, 10/6/10, SURYADIPTA ROY wrote:
> My dependent variables are heavily right skewed, and
> originally a logarithmic transformation did not help with
> the normality of the (conditional) distribution of
> the residuals. I transformed the dependent variables
> using  - ladder - (indicate that it did a logarithmic
> transformation also) and after carrying out the regressions
> find that the residuals are normally distributed, -ovtest-
> and -linktest- do not indicate any unexplained variation,
> the level of significance in all cases are significantly
> higher, and the explanatory variable of interest is coming
> out to be strongly significant in most cases (all good 
> news!?) I am wondering if someone could suggest me as to how
> much faith I can have in the regressions, and why did my
> original transformation did not yield the same results as 
>- ladder- .

That is very odd; you should get exactly the same variable:

*------ begin example ---------
sysuse citytemp, clear
ladder tempjuly, gen(sqrt1)
gen sqrt2 = sqrt(tempjuly)
assert sqrt1 == sqrt2
*------- end example ----------

Anyhow, if you want to interpret your results you are usually
much better off using -glm- with the appropriate -link()-
option (in your case -link(log)-) than transforming your
dependent variable. The whole logic behind regression is that
you want to look at how the mean of your dependent variable
differs across values of your independent variables. If you
(non-linearly) transform your dependent variable, then you are
no longer looking at the mean of your dependent variable, and
back-transforming won't work either. A nice discussion can be
found here:

Nicholas J. Cox, Jeff Warburton, Alona Armstrong, Victoria J. Holliday 
(2007) "Fitting concentration and load rating curves with generalized
linear models" Earth Surface Processes and Landforms, 33(1):25--39.
<http://www3.interscience.wiley.com/journal/114281617/abstract>
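
A minimal sketch of the difference between the two approaches
(not taken from the paper above; the auto data, the covariates,
and the choice of -family(gamma)- are assumptions made purely
for illustration, and another family may suit your data better):

*------ begin example ---------
sysuse auto, clear

* model the log of the mean of price directly
glm price mpg foreign, family(gamma) link(log) vce(robust)

* redisplay with exponentiated coefficients (ratios of means)
glm, eform

* predicted mean of price on the original scale
predict mu_glm, mu

* for comparison: linear regression on the transformed variable
gen lnprice = ln(price)
regress lnprice mpg foreign
predict xb_ols, xb

* naive back-transformation: this is not the mean of price
gen naive = exp(xb_ols)

summarize price mu_glm naive
*------- end example ----------

The -glm- predictions are statements about the mean of price
itself, while exp() of the predictions from the regression on
ln(price) is closer to a geometric mean, so the two need not
agree.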

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


      
