Statalist The Stata Listserver



Re: st: analysis of continuous gestational age


From   Richard Goldstein <[email protected]>
To   [email protected]
Subject   Re: st: analysis of continuous gestational age
Date   Mon, 02 Apr 2007 09:23:51 -0400

There is *no* assumption in linear regression that the *data*
be normally distributed -- the estimates are unaffected by
this.  However, if you want to trust the confidence intervals
and/or the p-values, then the residuals must be normally
distributed.  There is no necessary relationship between the
distribution of the residuals and the distribution of the
data -- how else could one use "dummy" variables?
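
The dummy-variable point can be illustrated with a small simulation. This is only a sketch in Python rather than Stata; the coefficients (1.0 and 4.0), the sample size, and the seed are arbitrary assumptions, and the helper functions are hypothetical names introduced for the illustration. The predictor is a 0/1 dummy, so neither it nor the outcome is normally distributed, yet the residuals come out close to normal because the error term is.

```python
import random
import statistics

random.seed(2007)  # arbitrary seed, only to make the sketch reproducible

n = 5000
# Assumed model: y = 1.0 + 4.0 * d + e, with d a 0/1 dummy and e ~ N(0, 1)
d = [random.randint(0, 1) for _ in range(n)]
y = [1.0 + 4.0 * di + random.gauss(0.0, 1.0) for di in d]

# Ordinary least squares with a single predictor (closed form)
mean_d = statistics.fmean(d)
mean_y = statistics.fmean(y)
sxy = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y))
sxx = sum((di - mean_d) ** 2 for di in d)
slope = sxy / sxx
intercept = mean_y - slope * mean_d
residuals = [yi - (intercept + slope * di) for di, yi in zip(d, y)]


def skewness(values):
    """Sample skewness (0 for a normal distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def excess_kurtosis(values):
    """Sample kurtosis minus 3 (0 for a normal distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 4 for v in values) / len(values) - 3.0


print("slope estimate:", round(slope, 3))
print("y:         skewness", round(skewness(y), 3),
      "excess kurtosis", round(excess_kurtosis(y), 3))
print("residuals: skewness", round(skewness(residuals), 3),
      "excess kurtosis", round(excess_kurtosis(residuals), 3))
```

In this setup the outcome is a mixture of two well-separated normals, so its excess kurtosis is clearly negative, while the residuals should show skewness and excess kurtosis near zero.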

Rich Goldstein

Svend Juul wrote:
Alo wrote:

What about the idea that we can use linear regression even if the
residuals are not normally distributed if we have a large dataset? Is
there any basis for this?

-----------------------------------------------------------------------

No, not in the school I went to. You might be thinking of the fact that
with large datasets even unimportant deviations from normality become
statistically significant, so you should not use significance testing
to decide whether the deviation is important, but rather graphical
inspection.
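
A rough sketch of why a significance test is a poor guide here, written in Python rather than Stata. The mildly skewed distribution, the sample sizes, and the seed are assumptions chosen for illustration, and the z-statistic (skewness divided by its approximate standard error under normality, sqrt(6/n)) is a crude stand-in for a formal normality test. The departure from normality is the same small one at both sample sizes, but the test statistic grows with n.

```python
import math
import random
import statistics

random.seed(42)  # arbitrary seed, only to make the sketch reproducible


def skewness(values):
    """Sample skewness (0 for a normal distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def draw(n):
    """Standard normal plus a small centred exponential term.

    The result is only mildly right-skewed (an assumed example distribution).
    """
    return [random.gauss(0.0, 1.0) + 0.3 * (random.expovariate(1.0) - 1.0)
            for _ in range(n)]


def skewness_z(values):
    """Skewness over its approximate standard error under normality."""
    return skewness(values) / math.sqrt(6.0 / len(values))


results = {}
for n in (200, 200_000):
    sample = draw(n)
    results[n] = (skewness(sample), skewness_z(sample))
    print(f"n={n:>7}  skewness={results[n][0]:.3f}  z={results[n][1]:.2f}")
```

With the large sample the skewness estimate stays small in absolute terms while z is far beyond conventional cutoffs, which is why a plot of the residuals says more about whether the deviation matters.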

Regardless of dataset size: Gestational age data are not from a normal
distribution; they deviate a lot from that assumption.

Svend
__________________________________________

Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000 Aarhus C, Denmark
Phone: +45 8942 6090
Home: +45 8693 7796
Email: [email protected]
__________________________________________


