-----Original Message-----
Brendan wrote:
I am working with a dataset of 30,000 observations. Some of
the explanatory variables are continuous. When I run the usual tests
for normality, the sample is too large for swilk or sfrancia,
and sktest returns "absurdly" large test statistics and rejects
the hypothesis of a normal distribution. The frequency histogram,
cumulative frequency plot, and normal plot all look normal, with no
outliers. I presume that with so many observations even very small
deviations from normality will produce a significant result. The
Box-Tidwell test indicates that the model relationship is linear for all
these continuous variables. Is it safe to ignore the sktest results?
Regards
Brendan
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
Dear Brendan,
I think you are right.
Quoting Svend Juul's (really helpful) textbook "An Introduction to Stata for
Health Researchers" (Stata Press, 2006), p. 110: "...Significant test for
normality may, however, be misleading: With large datasets, even unimportant
departures from normality become statistically significant, and the most
important tool is visual inspection".
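
To see the effect for yourself, here is a minimal simulation sketch. It is
an illustration only, not taken from Juul's book or from your data. It
assumes a Stata version that provides the rt() random-number function; the
seed, the variable name x, and the choice of a t distribution with 30
degrees of freedom are arbitrary assumptions of mine. That distribution is
very close to normal but has slightly heavier tails, so with 30,000
observations sktest should typically reject normality even though the plots
look close to normal.

```stata
* Simulate 30,000 draws from a distribution that is close to,
* but not exactly, normal (illustrative assumption: t with 30 df).
clear
set seed 12345               // arbitrary seed, for reproducibility only
set obs 30000
generate double x = rt(30)   // slightly heavier tails than the normal

* Formal test: skewness/kurtosis test for normality
sktest x

* Visual checks: histogram with normal density overlay,
* and normal quantile plot
histogram x, normal
qnorm x
```

Rerunning the same lines with a much smaller "set obs" (say 200) should
usually give a non-significant sktest result for the same distribution,
which is the sense in which the test reflects sample size more than any
practically important departure from normality.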
HTH and Best Regards,
Carlo