Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: qnorm and ttest question
Date: Thu, 2 Feb 2012 18:49:13 -0500

I have not seen the plot or any summary statistics for your data, but the pattern that you describe in the plot indicates some skewness, with the left tail being lighter than that of a normal distribution.

For most distributions of data, the CLT takes hold at fairly small sample sizes. You should not have a problem with the t-test.

But why stop there? With such a large sample size, you could compare the distributions in the two groups in considerable detail. One graphical approach would use an "empirical Q-Q plot" (an analog of a normal probability plot in which the points are the corresponding quantiles in the two samples).

David Hoaglin

> I try to see the data for "total worked hour in the past week" is
> normal distribution or not. I used qnorm and got a graph which most of
> dots fall on/closed to the line but the left side tail is above the
> line as "worked-hour" is always non negative.
>
> what should I say about this distribution?
>
> I want to do ttest on 2 groups. Is it correct that they should be
> normal distribution in order ttest result to be void? Can I apply CLT
> and assume them as normal distribution as my sample is greater than
> 20,000? I have tried the sktest and they did not pass the test.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
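The empirical Q-Q comparison suggested in the reply pairs up corresponding quantiles of the two samples. A minimal sketch in Python follows, using only the standard library. The function name `empirical_qq`, the group names, and the simulated "hours worked" data are all hypothetical and stand in for the poster's real data, which is not shown in the thread. In Stata itself, `qqplot` draws this kind of plot for two variables.

```python
import random
from statistics import quantiles


def empirical_qq(sample_a, sample_b, n_points=99):
    """Return (quantile_a, quantile_b) pairs taken at matching probabilities.

    Hypothetical helper for illustration; not part of any library.
    """
    # quantiles(..., n=k) returns k - 1 cut points, so n = n_points + 1
    # yields exactly n_points quantiles from each sample.
    qa = quantiles(sample_a, n=n_points + 1, method="inclusive")
    qb = quantiles(sample_b, n=n_points + 1, method="inclusive")
    return list(zip(qa, qb))


# Simulated, non-negative "hours worked" for two groups (assumed data).
rng = random.Random(2012)
group_a = [max(0.0, rng.gauss(40, 10)) for _ in range(10000)]
group_b = [max(0.0, rng.gauss(43, 10)) for _ in range(10000)]

pairs = empirical_qq(group_a, group_b)

# With 99 points, pairs[p - 1] holds the p-th percentile of each group.
for p in (5, 25, 50, 75, 95):
    qa, qb = pairs[p - 1]
    print(f"{p:2d}th percentile: group A = {qa:6.2f}, group B = {qb:6.2f}")
```

Plotting the pairs against the line y = x shows where the two distributions differ: points parallel to the line but offset from it suggest a shift in location, while a slope different from 1 suggests a difference in spread.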

**References**: **st: qnorm and ttest question**, *From:* Stata <stataq@gmail.com>
