Statalist



st: R: Testing normality of a continuous predictor variable in a logistic model


From   "Carlo Lazzaro" <carlo.lazzaro@tin.it>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: R: Testing normality of a continuous predictor variable in a logistic model
Date   Tue, 27 Nov 2007 12:27:02 +0100

-----Original Message-----
Brendan wrote:

I am working with a dataset containing 30000 observations. Some of
the explanatory variables are continuous. If I run the usual tests
for normality, the sample is too large for swilk or sfrancia, and
sktest returns "absurdly" large test statistics and rejects the
hypothesis of a normal distribution. The frequency histogram,
cumulative frequency plot and normal plot all look normal, with no
outliers. I presume that with such a large sample even very small
deviations from normality will lead to a significant result. The
Box-Tidwell test indicates that the model relationship is linear for
all these continuous variables. Is it safe to ignore the sktest results?
Regards
Brendan
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Dear Brendan,

I think you are right.
Quoting Svend Juul's (really helpful) textbook "An Introduction to Stata for
Health Researchers" (Stata Press, 2006, p. 110): "...Significance tests for
normality may, however, be misleading: with large datasets, even unimportant
departures from normality become statistically significant, and the most
important tool is visual inspection".
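
To make the large-sample point concrete, here is a minimal sketch in Python (standard library only). It is not Stata's sktest: it swaps in the simpler Jarque-Bera statistic, which also combines skewness and kurtosis and is referred to a chi-squared distribution with 2 degrees of freedom. The data-generating rule (a standard normal plus a small quadratic term), the coefficient 0.03, the seed and the helper names are all assumptions made up for illustration. With 30000 observations the slight skew should be rejected decisively, while a few hundred observations from the same distribution will often not reject.

```python
import math
import random


def moments(xs):
    """Return sample skewness and kurtosis (simple moment estimators)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera(xs):
    """Jarque-Bera statistic and its asymptotic p-value.

    Under normality JB is approximately chi-squared with 2 df,
    whose upper-tail probability is exp(-JB / 2).
    """
    n = len(xs)
    skew, kurt = moments(xs)
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb, math.exp(-jb / 2.0)


def slightly_skewed(n, a, rng):
    """Hypothetical generator: z + a * (z^2 - 1), nearly normal for small a."""
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        out.append(z + a * (z * z - 1.0))
    return out


rng = random.Random(2007)  # arbitrary seed, for reproducibility only

big = slightly_skewed(30000, 0.03, rng)
small = slightly_skewed(300, 0.03, rng)

skew_big, kurt_big, jb_big, p_big = jarque_bera(big)
skew_small, kurt_small, jb_small, p_small = jarque_bera(small)

print("n=30000: skew=%.3f kurt=%.3f JB=%.1f p=%.3g"
      % (skew_big, kurt_big, jb_big, p_big))
print("n=300:   skew=%.3f kurt=%.3f JB=%.1f p=%.3g"
      % (skew_small, kurt_small, jb_small, p_small))
```

The skewness is about the same in both samples; only the sample size changes the verdict. That is why a histogram or normal quantile plot is usually more informative here than the p-value.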

HTH and Best Regards,

Carlo



