st: R: qnorm and ttest question


From   <carlo.lazzaro@tiscalinet.it>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: R: qnorm and ttest question
Date   Fri, 3 Feb 2012 10:44:10 +0100

Dear Stata,
Please take a look at Statalist FAQ #2, point 3 ("Please note that many members
are less inclined to answer anonymous emails, sometimes to the point of
ignoring them on principle").
As far as your query is concerned, I agree with David Hoaglin that with
20,000 observations you should not run into any problems with the t-test.
However, an alternative is to calculate a bootstrap p-value (please see
the code below):
----------------------------code begins--------------------------------------------
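* Simulated example: two groups of 50 observations each
drop _all
set obs 100
g group=1 in 1/50
replace group=2 in 51/100
g worked_hour=(60*runiform())
* Add outliers to group 1, then test normality by group (Shapiro-Wilk)
replace worked_hour=100 in 45/47
by group, sort: swilk worked_hour
* Add outliers to group 2 as well and test again
replace worked_hour=100 in 95/100
by group, sort: swilk worked_hour
* Observed t statistic from the unequal-variances t-test
ttest worked_hour, by(group) unequal
return list
scalar t=r(t)
* Combined mean of the two groups
g comb_mean=((r(mu_1)*r(N_1))+(r(mu_2)*r(N_2)))/(r(N_1)+r(N_2))
* Recenter each group on the combined mean so that the null hypothesis
* of equal means holds in the data to be resampled
sum worked_hour if group==1, meanonly
replace worked_hour= worked_hour-r(mean)+comb_mean if group==1
sum worked_hour if group==2, meanonly
replace worked_hour= worked_hour-r(mean)+comb_mean if group==2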
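* Bootstrap the t statistic under the null. The statistic is named t so that
* the saved variable is called t rather than the default _bs_1.
* The /// line continuations require running this from a do-file.
bootstrap t=r(t), reps(10000) nodots strata(group) ///
    saving("C:\Users\user\Desktop\bootstrap.dta", every(1) double replace) ///
    seed(12345) : ttest worked_hour, by(group) unequal
use "C:\Users\user\Desktop\bootstrap.dta", clear
* Bootstrap p-value: share of bootstrap |t| at least as large as the observed |t|
generate indicator = abs(t)>=abs(scalar(t))
summarize indicator, meanonly
display "p_bootstrap = " r(mean)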
----------------------------code ends----------------------------------------------
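
On the large-sample point, a small simulation may help to see why the t-test is usually safe with 20,000 observations even when the data are skewed. What follows is only a sketch under stated assumptions: the program name ttsim, the exponential distribution with mean 40, the equal group sizes of 10,000, and the 1,000 replications are all made up for illustration and are not taken from your data. If the large-sample approximation holds, the rejection rate should come out close to the nominal 0.05.

```stata
* Hypothetical check of the Welch t-test's rejection rate under H0
* when the outcome is right-skewed and non-negative (assumed exponential).
capture program drop ttsim
program define ttsim, rclass
    drop _all
    set obs 20000
    * Two groups of 10,000 observations each (assumed equal sizes)
    generate byte group = 1 + (_n > 10000)
    * Same distribution in both groups, so the null of equal means is true
    generate double worked_hour = -40*ln(runiform())
    ttest worked_hour, by(group) unequal
    * Two-sided p-value stored by ttest
    return scalar p = r(p)
end

* Repeat the experiment and collect the p-values in a new dataset
simulate p = r(p), reps(1000) seed(12345) nodots: ttsim

* Share of replications that reject at the 5% level
generate byte reject = p < 0.05
summarize reject, meanonly
display "rejection rate at the 5% level = " r(mean)
```

To mimic your data more closely, you could replace the exponential draw with a distribution that resembles your own "worked-hour" variable and set the group sizes to the ones you actually have.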

HTH and Kind Regards,
Carlo
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stata
Sent: Thursday, 2 February 2012 21:10
To: statalist@hsphsun2.harvard.edu
Subject: st: qnorm and ttest question

Hello,

I am trying to check whether the data for "total worked hour in the past week"
are normally distributed. I used qnorm and got a graph in which most of the
dots fall on or close to the line, but the left tail is above the line, since
"worked-hour" is always non-negative.

What should I say about this distribution?

I want to do a ttest on 2 groups. Is it correct that they should be normally
distributed in order for the ttest result to be valid? Can I apply the CLT and
treat them as normally distributed, as my sample is greater than 20,000? I have
tried sktest and they did not pass the test.

Any advice on how to handle these problems?

Thanks.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

