
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: ladder question for right-skewed variable


From   Gabriel Nelson <[email protected]>
To   statalist <[email protected]>
Subject   Re: st: ladder question for right-skewed variable
Date   Fri, 26 Apr 2013 13:57:16 -0700

Thanks very much for your suggestions, Nick. It makes sense that the
problem might lie within -sktest-. I won't worry any more about this
problem and will just proceed with the -qnorm- command, as you suggested.
Thanks again.

Gabriel

On Fri, Apr 26, 2013 at 11:45 AM, Nick Cox <[email protected]> wrote:
> Three assertions based on a mix of experience and prejudice:
>
> 1. The best way to check for normality is with -qnorm-. Even if
> normality is not your reference case, asymmetry will show up clearly
> on a -qnorm- graph.
>
> 2. 90% of the time, choosing transformations boils down to whether
> three possible transformations are any use: root, logarithm or
> reciprocal.
>
> 3. So, do-it-yourself is easy:
>
> gen rtmyvar = sqrt(myvar)
> gen logmyvar = log(myvar)
> gen recmyvar = 1/myvar
>
> qnorm myvar, name(a)
> qnorm rtmyvar, name(b)
> qnorm logmyvar, name(c)
> qnorm recmyvar, name(d)
>
> Not universally known fact: Giving a name to a graph means that it
> sticks around until _you_ close it. So, you have four graphs on your
> monitor. Arrange them with your mouse so you can compare. Usually it's
> easy to pick what works best, without any formal machinery.
>
> (Yes, I know about -gladder-, but this is simpler in practice.)
>
>
> Nick
> [email protected]
>
>
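The do-it-yourself comparison above can also be gathered into a single window rather than arranged by hand. The following is only a sketch under assumptions: myvar is the placeholder name from the post and is assumed to be strictly positive (so that the log and reciprocal are defined for every observation), and the graph names, titles, and the use of -graph combine- are additions for illustration, not part of the original advice.

```stata
* Sketch: compare candidate transformations side by side.
* myvar is the placeholder from the post; assumed strictly positive.
gen rtmyvar  = sqrt(myvar)
gen logmyvar = log(myvar)
gen recmyvar = 1/myvar

* replace lets the block be rerun without "graph already exists" errors
qnorm myvar,    name(a, replace) title("identity")
qnorm rtmyvar,  name(b, replace) title("root")
qnorm logmyvar, name(c, replace) title("log")
qnorm recmyvar, name(d, replace) title("reciprocal")

* One window holding all four normal quantile plots
graph combine a b c d, name(all, replace)
```

The transformation whose points lie closest to the reference line is the most nearly symmetric on that scale.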
> On 26 April 2013 19:20, Nick Cox <[email protected]> wrote:
>> Just to underline that -summarize- calculated the kurtosis of your
>> variable as 108. That's BIG. No wonder -sktest- can't cope.
>> Nick
>> [email protected]
>>
>>
>> On 26 April 2013 19:17, Nick Cox <[email protected]> wrote:
>>> That's not quite "no transformations appeared in the output" as
>>> -ladder- is signalling P-values for some cases.
>>>
>>> But I readily agree that -ladder- is not doing a good job here at all.
>>>
>>> In fact, I am now reminded of evident -ladder- problems shown in a
>>> recent thread starting at
>>> http://www.stata.com/statalist/archive/2013-02/msg00862.html
>>>
>>> I can't find a public email, even though I thought I posted on this,
>>> but my impression from looking at the code is that -ladder- is
>>> essentially fragile. The real problem here is within -sktest-. It can
>>> break down, it seems, for large sample sizes and/or large deviations
>>> from Gaussianity. Then it bounces back missings.
>>>
>>> I think you just need to abandon -ladder-. It's not essential. You
>>> don't need _any_ test to tell you that some transformation will help
>>> if the goal is to reduce asymmetry, and there are only a few credible
>>> alternatives.
>>>
>>> As David and I pointed out, log transformation should work quite well
>>> for your data,
>>>
>>> but but but: (my suggestion; David may not agree) why transform at
>>> all? Your solutions start with -poisson- (or, for consenting adults,
>>> -nbreg-).
>>>
>>> BTW, -ladder- is a command, not a function, and in Stata ne'er the
>>> twain shall meet.
>>>
>>> Nick
>>> [email protected]
>>>
>>>
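The suggestion above to model the count directly instead of transforming it might look roughly like this. It is a minimal sketch, not a recommended specification: disp_2000 is the variable from the thread, but x1 and x2 are hypothetical predictors that do not appear anywhere in the discussion, and the robust standard errors are an added assumption.

```stata
* Sketch only: x1 and x2 are hypothetical predictors, not from the thread.
* Poisson regression of the raw count, with robust standard errors
poisson disp_2000 x1 x2, vce(robust)

* Negative binomial regression as an alternative when counts are overdispersed
nbreg disp_2000 x1 x2
```

Both commands work on the original scale of the response, so no transformation of disp_2000 is needed before fitting.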
>>> On 26 April 2013 18:55, Gabriel Nelson <[email protected]> wrote:
>>>> Thanks Nick, yes exactly, my question is why the ladder function fails
>>>> to provide any chi-square values here. I'll attach the Stata output
>>>> here:
>>>>
>>>> . ladder disp_2000
>>>>
>>>> Transformation         formula               chi2(2)       P(chi2)
>>>> ------------------------------------------------------------------
>>>> cubic                  dis~2000^3                 .            .
>>>> square                 dis~2000^2                 .            .
>>>> identity               dis~2000                   .            .
>>>> square root            sqrt(dis~2000)             .        0.000
>>>> log                    log(dis~2000)              .        0.000
>>>> 1/(square root)        1/sqrt(dis~2000)           .        0.000
>>>> inverse                1/dis~2000                 .        0.000
>>>> 1/square               1/(dis~2000^2)             .        0.000
>>>> 1/cubic                1/(dis~2000^3)             .        0.000
>>>>
>>>> . sum disp_2000, detail
>>>>
>>>>       Number displaced 2000 (if data unavailable go up
>>>>                            to 2003
>>>> -------------------------------------------------------------
>>>>       Percentiles      Smallest
>>>>  1%            1              1
>>>>  5%            2              1
>>>> 10%            3              1       Obs                1010
>>>> 25%            6              1       Sum of Wgt.        1010
>>>>
>>>> 50%         15.5                      Mean           281.5297
>>>>                         Largest       Std. Dev.      1217.168
>>>> 75%           82           9421
>>>> 90%        436.5           9505       Variance        1481497
>>>> 95%         1251          16255       Skewness       9.012044
>>>> 99%         5953          19569       Kurtosis       108.8061
>>>>
>>>> On Fri, Apr 26, 2013 at 10:47 AM, Nick Cox <[email protected]> wrote:
>>>>> Please see my answers too. You have still not given the exact -ladder-
>>>>> command you used or its output, so it is really difficult to know what
>>>>> is going on.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/



-- 
Gabriel Nelson
Doctoral Candidate
Dept. of Sociology
University of California- Los Angeles
http://www.soc.ucla.edu/people/graduate-student?lid=4344

