Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: R: qnorm and ttest question

From   <>
To   <>
Subject   st: R: qnorm and ttest question
Date   Fri, 3 Feb 2012 10:44:10 +0100

Dear Stata,
Please take a look at the Statalist FAQ, #2, point 3 ("Please note that many members
are less inclined to answer anonymous emails, sometimes to the point of
ignoring them on principle").
As far as your query is concerned, I agree with David Hoaglin that with
20,000 observations you should not run into any problems with the t-test.
However, an alternative is to calculate a bootstrap p-value (please see
the code below):
---------------------------------- Code
* Simulated example: two groups of 50 with a few outliers
drop _all
set obs 100
g group=1 in 1/50
replace group=2 in 51/100
g worked_hour=(60*runiform())
replace worked_hour=100 in 45/47
by group, sort: swilk worked_hour
replace worked_hour=100 in 95/100
by group, sort: swilk worked_hour
* Observed Welch t statistic, kept in a scalar
ttest worked_hour, by(group) unequal
return list
scalar t=r(t)
* Shift both groups to the combined mean so that the null hypothesis holds
g comb_mean=((r(mu_1)*r(N_1))+(r(mu_2)*r(N_2)))/(r(N_1)+r(N_2))
sum worked_hour if group==1, meanonly
replace worked_hour= worked_hour-r(mean)+comb_mean if group==1
sum worked_hour if group==2, meanonly
replace worked_hour= worked_hour-r(mean)+comb_mean if group==2
* Bootstrap the t statistic under the null; naming it t makes the saved
* variable t (unnamed, it would be saved as _bs_1). Run from a do-file (///).
bootstrap t=r(t), reps(10000) nodots strata(group) ///
    saving(C:\Users\user\Desktop\bootstrap.dta, every(1) double replace) ///
    seed(12345) : ttest worked_hour, by(group) unequal
use "C:\Users\user\Desktop\bootstrap.dta", clear
* Share of bootstrap statistics at least as extreme as the observed one
generate indicator = abs(t)>=abs(scalar(t))
summarize indicator, meanonly
display "p_bootstrap = " r(mean)
---------------------------------- Code
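
The point about large samples can also be checked by simulation. The following is only a minimal sketch, not part of the original exchange: the program name sim_ttest, the chi-squared data-generating process, the group sizes, and the number of replications are all assumptions chosen for illustration. It draws skewed, non-negative "hours" for two groups with the same population mean, runs the unequal-variance t-test each time, and reports how often the test rejects at the 5% level.

```stata
* Sketch under stated assumptions: all names and settings are illustrative
clear all
set seed 12345

program define sim_ttest, rclass
    drop _all
    set obs 20000
    * Two groups of 10,000 observations each
    generate byte group = cond(_n <= 10000, 1, 2)
    * Right-skewed, non-negative values with the same mean (40) in both groups
    generate double worked_hour = 20*rchi2(2)
    ttest worked_hour, by(group) unequal
    return scalar p = r(p)
end

* Repeat the experiment and collect the two-sided p-values
simulate p = r(p), reps(1000) nodots: sim_ttest

* Share of replications in which the true null is rejected at the 5% level
generate byte reject = p < 0.05
summarize reject, meanonly
display "Share of rejections at the 5% level = " r(mean)
```

If the t-test is behaving well at this sample size, the reported share should be close to the nominal 0.05. A smaller value in set obs shows how the picture changes with fewer observations.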

HTH and Kind Regards,
-----Original Message-----
[] On behalf of Stata
Sent: Thursday, 2 February 2012 21:10
Subject: st: qnorm and ttest question


I am trying to check whether the data for "total worked hours in the past
week" are normally distributed. I used qnorm and got a graph in which most
of the dots fall on or close to the line, but the left tail lies above the
line because "worked-hour" is always non-negative.

What should I say about this distribution?

I want to run ttest on 2 groups. Is it correct that both groups must be
normally distributed for the ttest result to be valid? Can I apply the CLT
and treat them as normally distributed, since my sample is larger than
20,000? I tried sktest and the data did not pass the test.

Any advice on how to handle these problems?

*   For searches and help try:

