Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: R: qnorm and ttest question

From	<[email protected]>
To	<[email protected]>
Subject	st: R: qnorm and ttest question
Date	Fri, 3 Feb 2012 10:44:10 +0100

Dear Stata,
Please take a look at Stata FAQ #2, point 3 ("Please note that many members
are less inclined to answer anonymous emails, sometimes to the point of
ignoring them on principle").
As far as your query is concerned, I agree with David Hoaglin that with
20,000 observations you should not come across any problems with the t-test.
However, another approach is to calculate a bootstrap p-value (please see
the code below):
---------------------------- code begins ----------------------------
* Simulate 100 observations in two groups of 50
drop _all
set obs 100
g group=1 in 1/50
replace group=2 in 51/100
g worked_hour=(60*runiform())
* Add outliers to each group and check normality with Shapiro-Wilk
replace worked_hour=100 in 45/47
by group, sort: swilk worked_hour
replace worked_hour=100 in 95/100
by group, sort: swilk worked_hour
* Observed t statistic (unequal variances), stored as a scalar
ttest worked_hour, by(group) unequal
return list
scalar t=r(t)
* Recenter both groups on the combined mean so that the null hypothesis holds
g comb_mean=((r(mu_1)*r(N_1))+(r(mu_2)*r(N_2)))/(r(N_1)+r(N_2))
sum worked_hour if group==1, meanonly
replace worked_hour= worked_hour-r(mean)+comb_mean if group==1
sum worked_hour if group==2, meanonly
replace worked_hour= worked_hour-r(mean)+comb_mean if group==2
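* Bootstrap the t statistic under the null. It is named t here so that the
* saved dataset contains a variable t (an unnamed r(t) would be saved as _bs_1).
* The /// line continuations require running this from a do-file.
bootstrap t=r(t), reps(10000) nodots strata(group) ///
    saving(C:\Users\user\Desktop\bootstrap.dta, every(1) double replace) ///
    seed(12345) : ttest worked_hour, by(group) unequal
use "C:\Users\user\Desktop\bootstrap.dta", clear
* Share of bootstrap |t| values at least as large as the observed |t|
generate indicator = abs(t)>=abs(scalar(t))
summarize indicator, meanonly
display "p_bootstrap = " r(mean)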
---------------------------- code ends ----------------------------
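
The point about 20,000 observations can also be checked by simulation. What follows is only a minimal sketch under stated assumptions: the program name simttest, the gamma distribution used to mimic skewed, non-negative hours, the equal group sizes, and the 1,000 replications are all illustrative choices, not taken from the original poster's data. Both groups are drawn from the same distribution, so the null hypothesis is true and the rejection rate at the 5% level should come out near 0.05 if the t-test behaves well at this sample size.

```stata
* Hypothetical helper program: one simulated dataset, one t-test, returns the p-value
capture program drop simttest
program define simttest, rclass
    drop _all
    set obs 20000
    * two groups of 10,000 observations each (assumed group sizes)
    generate byte group = 1 + (_n > 10000)
    * right-skewed, non-negative values from the same distribution in both groups
    generate double hours = rgamma(2, 10)
    ttest hours, by(group) unequal
    return scalar p = r(p)
end

* Repeat the experiment and collect the two-sided p-values
simulate p = r(p), reps(1000) nodots seed(12345): simttest

* Empirical rejection rate at the 5% level under a true null
generate byte reject = p < 0.05
summarize reject, meanonly
display "empirical rejection rate = " r(mean)
```

To make the check closer to the real data, the rgamma() draw could be replaced by resampling from the observed worked-hour variable after recentering the groups, as in the bootstrap code above.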

HTH and Kind Regards,
Carlo
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Stata
Sent: Thursday, 2 February 2012 21:10
To: [email protected]
Subject: st: qnorm and ttest question

Hello,

I am trying to see whether the data for "total worked hours in the past
week" are normally distributed. I used qnorm and got a graph in which most
of the dots fall on or close to the line, but the left tail is above the
line because "worked-hour" is always non-negative.

What should I say about this distribution?

I want to do a ttest on 2 groups. Is it correct that they should be normally
distributed in order for the ttest result to be valid? Can I apply the CLT
and treat them as normally distributed, since my sample is greater than
20,000? I tried sktest and they did not pass the test.

Any advice on how to handle these problems?

Thanks.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



