Statalist



RE: st: Log Normality of Dependentvar


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Log Normality of Dependentvar
Date   Mon, 8 Jun 2009 18:34:47 +0100

In addition, -transint- on SSC is a package containing a longish document which introduces transformations in Stata help file form. In my view, transformations are often very poorly explained, even in many textbooks. I was reckless enough to try to do better, for at least some audiences. 

Nick 
n.j.cox@durham.ac.uk 

sjsamuels@gmail.com

Look up references to the Box-Cox transformation.  That is what you
did when you ran -bcskew0-: "bc" = "Box-Cox".  But I have to correct
my correction (you can see where this is heading!).  The original
Box-Cox transformation, implemented in -boxcox-, is the one that tries
to transform to normality.  -bcskew0-, as its name implies, finds the
transformation of the same form that produces zero skewness.
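
A minimal sketch of that distinction, not taken from the thread. It assumes Stata's shipped auto dataset and its price variable as stand-ins for the real data, and the new variable name bc_price is hypothetical. -bcskew0- returns the power that gives zero skewness, while -boxcox- estimates the power by maximum likelihood under a normal model.

```stata
* Illustration only: auto.dta and price are stand-ins for the poster's data
sysuse auto, clear

* Zero-skewness power: creates bc_price = (price^L - 1)/L
bcskew0 bc_price = price
display "zero-skewness lambda = " r(lambda)

* Maximum-likelihood Box-Cox power, transforming the left-hand side only
* (constant-only model here; add regressors after price if there are any)
boxcox price, model(lhsonly)

* Inspect the skewness of the -bcskew0- result
summarize bc_price, detail
```

The two commands optimize different criteria, so the two estimated powers need not agree.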

On Mon, Jun 8, 2009 at 1:08 PM, Christian
Weiss<christian.weiss@nightberry.de> wrote:
> Hi Steven,
>
> thanks a lot for your explanation!
>
> Unfortunately, it seems that part of your last message got cut off?
> Where can I find information on the "power transformation"? (Google
> does not offer too much in that respect)
>
> Chris
>
>
> On Mon, Jun 8, 2009 at 6:55 PM, <sjsamuels@gmail.com> wrote:
>> the best fitting power transform to normality. But it is not relevant
>> to -swilk- with the lnnormal option, because the power transform may
>> not be a log (power =0) and the command does not subtract off a shift
>> parameter.
>>
>> -Steve
>>
>> On Mon, Jun 8, 2009 at 12:38 PM, <sjsamuels@gmail.com> wrote:
>>> -Chris--
>>>
>>> -lnskew0- finds by iteration a value of k for which y = ln(x - k) has
>>> skewness zero.  The manual implies that, with the "lnnormal" option,
>>> -swilk- estimates k by the method of -lnskew0-.  In fact, the ado
>>> file for -swilk- does not call -lnskew0-, but instead computes an
>>> approximation.  This probably accounts for the discrepancy that you
>>> observed.
>>>
>>> Analyses of ln(var) and of the transformation from -bcskew0- are
>>> irrelevant to -swilk-, because the "lnnormal" option considers the
>>> hypothesis of a three-parameter lognormal distribution.  I presume
>>> that by "skskew0" you meant "lnskew0".
>>>
>>> -Steve
>>>
>>> On Mon, Jun 8, 2009 at 6:18 AM, Maarten buis<maartenbuis@yahoo.co.uk> wrote:
>>>>
>>>> --- On Mon, 8/6/09, Christian Weiss wrote:
>>>>> testing my dependent var via swilk or sfrancia rejects the
>>>>> Null Hypothesis of Normality.
>>>>
>>>> This is problematic for a number of reasons:
>>>>
>>>> 1) Regression never assumes that the dependent variable is
>>>> normally distributed, except when you have no explanatory
>>>> variables. It only assumes that the residuals are normally
>>>> distributed.
>>>>
>>>> 2) Testing for the normality of the residuals should only
>>>> be done once you are convinced that the other assumptions
>>>> have been met, as violations of the other assumptions are
>>>> likely to lead to residuals that look non-normal.
>>>>
>>>> 3) The normality of the residuals is probably the least
>>>> important of the regression assumptions, as regression
>>>> is reasonably robust to violations of it.
>>>>
>>>> 4) Tests are probably not the best way to assess whether
>>>> the errors are normally distributed. Graphical inspection
>>>> is usually more informative and powerful, see:
>>>> -help diagnostic plots- and -ssc d hangroot- for tools
>>>> to help with that.
>>>>
>>>> For a more general set of tools to perform post-estimation
>>>> checks of  regression assumptions see:
>>>> -help regress postestimation-.
>>>>
>>>>
>>>
>>> On Mon, Jun 8, 2009 at 5:38 AM, Christian
>>> Weiss<christian.weiss@nightberry.de> wrote:
>>>>
>>>> testing my dependent var via swilk or sfrancia rejects the Null
>>>> Hypothesis of Normality.
>>>> However, using the "lnnormal" option of swilk accepts the null
>>>> hypothesis - it seems that the dependent variable is lognormally
>>>> distributed.
>>>>
>>>>
>>>> Surprisingly, after transforming my dependent variable by ln(var) or by
>>>> skskew0 / bcskew0, swilk still rejects the null hypothesis of
>>>> normality.
>>>>
>>>> How can that be explained?
>>>>
>>>> ..puzzled...Chris
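
The comparisons discussed in this thread can be retraced with a short sketch. It is not from the original posts: the auto dataset, the variables price, mpg, and weight, and every new variable name are assumptions standing in for the poster's data. The first part sets the three tests side by side; the second follows the advice to examine the residuals rather than the dependent variable.

```stata
* Illustration only: auto.dta and its variables stand in for the real data
sysuse auto, clear

* Normality of the raw variable
swilk price

* Three-parameter lognormal; -swilk- uses its own approximation to k
swilk price, lnnormal

* Plain log fixes k = 0, which is a different hypothesis
generate ln_price = ln(price)
swilk ln_price

* -lnskew0- iterates to the k giving zero skewness of ln(price - k)
lnskew0 lnk_price = price
display "k from lnskew0 = " r(gamma)
swilk lnk_price

* Residual-based check: fit the model, then inspect the residuals
regress price mpg weight
predict double res, residuals
qnorm res
```

Because each step estimates or fixes k differently, the tests need not agree with one another.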



