-----Original Message-----
Brendan wrote:

I am working with a dataset containing 30,000 observations. Some of the explanatory variables are continuous. If I run the usual tests for normality, the sample is too large for swilk or sfrancia, and if I use sktest it returns "absurdly" large test statistics and rejects the hypothesis of a normal distribution. The frequency histogram, cumulative frequency plot and normal plot all look normal, with no outliers. I presume that with such a large sample even very small deviations from normality will lead to a significant result. The Box-Tidwell test indicates that the model relationship is linear for all of these continuous variables. Is it safe to ignore the sktest results?

Regards
Brendan

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Dear Brendan,

I think you are right. Quoting Svend Juul's (really helpful) textbook "An Introduction to Stata for Health Researchers" (Stata Press, 2006, p. 110):

"...Significant test for normality may, however, be misleading: With large datasets, even unimportant departures from normality becomes statistical significant, and the most important tool is visual inspection".

HTH and Best Regards,
Carlo
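To put rough numbers on the point about large samples, here is a minimal sketch in Python (not Stata). It uses the Jarque-Bera statistic as a stand-in, which is an assumption made for illustration: sktest combines skewness and kurtosis in a different way, so these are not the values Stata would report. The skewness and excess kurtosis of 0.05 are hypothetical, chosen to represent a departure too small to see in a histogram or normal plot.

```python
import math


def jarque_bera(n, skewness, excess_kurtosis):
    """Return (statistic, p_value) of the Jarque-Bera test for given sample moments."""
    # JB = n/6 * (S^2 + K^2/4); asymptotically chi-square with 2 df under normality.
    statistic = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # The chi-square(2) survival function has the closed form exp(-x/2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical, visually negligible departure from normality (assumed values).
SKEW = 0.05
EXCESS_KURT = 0.05

# Same sample moments, increasing sample size.
results = {n: jarque_bera(n, SKEW, EXCESS_KURT) for n in (300, 3000, 30000)}

for n, (stat, p) in results.items():
    print(f"n={n:>6}  JB={stat:8.3f}  p={p:.4f}")
```

Because the statistic grows in proportion to n while the sample moments stay fixed, the same tiny departure is nowhere near significant at n = 300 and is firmly rejected at n = 30,000. That is consistent with relying on the plots rather than on the test at this sample size.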

