
st: RE: Re: Normality Testing

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Re: Normality Testing
Date   Wed, 11 Feb 2004 12:33:29 -0000

I can't see your variable to comment but these
results don't surprise me. 

If you run

sysuse auto
foreach v of var price-gear { 
	qui swilk `v' if foreign
	di "`v' {col 20}" %4.3f r(p) 
}

you will get this: 

price              0.004
mpg                0.495
rep78              0.293
headroom           0.940
trunk              0.809
weight             0.026
length             0.813
turn               0.996
displacement       0.083
gear_ratio         0.013

If you then follow up, as you did,  
with say -qnorm- then -- even 
with a sample size this low, 22, 
chosen to be of the same order as 
your example -- you will see that 
a low P-value can correspond both to 
variables which look as if they should
be transformed and to variables which, 
to be sure, don't look exactly normal 
but would probably not be problematic
for -anova-. In short "looks as if it
isn't normal" is not the same as "looks
as if it would be problematic". 
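
A sketch of that follow-up, using the same data (the two variables are
picked here only for illustration, as those with the lowest and highest
P-values above):

sysuse auto, clear
* normal quantile plots for the 22 foreign cars
qnorm price if foreign
qnorm turn if foreign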

In any case I would put more emphasis
on choosing response scale on scientific
or substantive grounds than because of this
normality assumption (which, additionally, 
is about errors, not responses). The 
manual entry [R] diagnostic plots points
to Rupert Miller's book, which is excellent 
reading for this area. 
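
To look at errors rather than responses, one possibility is to fit the
model first and then examine its residuals. A sketch only: the model
below (mpg by rep78 in the auto data) and the variable name resid are
assumptions for illustration, not your analysis.

sysuse auto, clear
anova mpg rep78
* residuals from the fitted model, not the raw response
predict double resid, residuals
qnorm resid
swilk resid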

One of many merits of -glm- is that it lets you decouple the 
questions of response scale and error distribution. 
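
As a sketch of what that decoupling can look like (the choice of price
and weight from the auto data, and the name lnprice, are assumptions for
illustration only):

sysuse auto, clear
* transform the response: normal errors assumed on the log scale
generate double lnprice = ln(price)
regress lnprice weight
* keep the response as is: log link, normal errors on the original scale
glm price weight, family(gaussian) link(log)

Other choices of family() can be combined with the same link.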

[email protected] 

Karamjit Shad
> Prior to carrying out an anova I tested my data for normality and some of
> the data was non-normal. Ladder suggested a log transformation would be
> suitable. I then checked the transformed data using swilk and the data is
> still non-normal. However sfrancia indicates that it is normal.
> . swilk igg60 if group==3
>                    Shapiro-Wilk W test for normal data
>     Variable |    Obs        W          V          z     Prob>z
> -------------+-------------------------------------------------
>        igg60 |     30    0.74827      8.001      4.300  0.00001
> . swilk ligg60 if group==3
>                    Shapiro-Wilk W test for normal data
>     Variable |    Obs        W          V          z     Prob>z
> -------------+-------------------------------------------------
>       ligg60 |     30    0.91745      2.624      1.995  0.02305
> . sfrancia ligg60 if group==3
>                   Shapiro-Francia W' test for normal data
>     Variable |    Obs        W'         V'         z     Prob>z
> -------------+-------------------------------------------------
>       ligg60 |     30    0.93170      2.398      1.600  0.05479
> a qnorm plot shows the data to "gently" oscillate about the normal 
> distribution but nothing that would worry me too much.
> My question is what test should I use for testing for normality in this
> situation - or should I just use a non-parametric analysis.
