
# Re: st: Right skewed (positive) dependent variable

From: SURYADIPTA ROY
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Right skewed (positive) dependent variable
Date: Thu, 10 Jun 2010 11:31:03 -0400

```
Maarten,
Thank you very much indeed! I will definitely explore what you have
suggested. I was also surprised to see the different results, since
the transformed variables should be similar. However, as I look at my
program now, I discover the source of the anomaly: my transformation
was newvar=ln(1+oldvar), which explains it.

Thanks again,

On Thu, Jun 10, 2010 at 10:37 AM, Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> --- On Thu, 10/6/10, SURYADIPTA ROY wrote:
>> My dependent variables are heavily right skewed, and
>> originally a logarithmic transformation did not help with
>> the normality of the (conditional) distribution of
>> the residuals. I transformed the dependent variables
>> using -ladder- (which indicated a logarithmic
>> transformation as well) and, after carrying out the regressions,
>> find that the residuals are normally distributed, -ovtest-
>> and -linktest- do not indicate any unexplained variation,
>> the significance levels are considerably higher in all
>> cases, and the explanatory variable of interest is coming
>> out to be strongly significant in most cases (all good
>> news!?) I am wondering if someone could suggest how
>> much faith I can have in the regressions, and why my
>> original transformation did not yield the same results as
>
> That is very odd; you should get exactly the same variable:
>
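> *------ begin example ---------
> sysuse citytemp, clear
> * the line defining sqrt1 is missing from the archived message;
> * the next line is an assumed stand-in (square root as a power)
> gen sqrt1 = tempjuly^.5
> gen sqrt2 = sqrt(tempjuly)
> assert sqrt1 == sqrt2
> *------- end example ----------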
>
> Anyhow, if you want to interpret your results you are usually
> much better off using -glm- with the appropriate -link()-
> option than transforming the dependent variable. The whole logic
> behind regression is that you want to look at how the mean of your
> dependent variable differs across values of your independent
> variables. If you (non-linearly) transform your dependent variable,
> then you are no longer looking at the mean of your dependent
> variable, and back-transforming won't work either. A nice
> discussion can be found here:
>
> Nicholas J. Cox, Jeff Warburton, Alona Armstrong, Victoria J. Holliday
> (2007) "Fitting concentration and load rating curves with generalized
> linear models" Earth Surface Processes and Landforms, 33(1):25--39.
> <http://www3.interscience.wiley.com/journal/114281617/abstract>
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
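The contrast drawn in the thread, between regressing a log-transformed outcome and fitting -glm- with a log link, might be sketched as follows. This is only an illustration under stated assumptions: the auto dataset, the variables price, mpg and weight, and the gamma family are stand-ins chosen for the example, not anything taken from the original poster's analysis.

```stata
* Sketch only: auto.dta and these variables are stand-ins (assumptions),
* not the data discussed in the thread
sysuse auto, clear

* OLS on the log scale models E[ln(price) | x]
gen double lnprice = ln(price)
regress lnprice mpg weight

* -glm- with a log link models ln(E[price | x]) directly;
* family(gamma) is an assumed choice for a positive, right-skewed outcome
glm price mpg weight, family(gamma) link(log)
predict double mu_hat, mu    // fitted means on the original price scale

* ln(1 + y) is a different variable from ln(y), so fits based on it differ
gen double ln1p = ln(1 + price)
assert ln1p != lnprice
```

With the log link, the fitted values in mu_hat are already means on the original scale, so no back-transformation is needed. The final assert illustrates the anomaly found in the reply: ln(1+oldvar) is not the same variable as ln(oldvar).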