Re: st: sign test output
Maarten Buis <email@example.com>
Thu, 17 Jan 2013 14:56:01 +0100
On Thu, Jan 17, 2013 at 2:13 PM, Nahla Betelmal wrote:
> However, if normality test is proved to be useful only for huge sample
> as Maarten mentioned.
I would argue that they are not useful under any circumstance. In
small samples the asymptotics have not kicked in yet, so the p-values
don't mean what you think they mean, and in large samples these tests
will detect meaningless deviations from Gaussianity.
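The large-sample half of that argument can be sketched with simulated data. The sample size, the seed, and the choice of a t-distribution with 30 degrees of freedom are assumptions made only for illustration; such a variable is very close to Gaussian for most practical purposes.
*------------------ begin example ------------------
clear
set seed 12345
set obs 100000
// t-distribution with 30 degrees of freedom: nearly Gaussian,
// but with slightly heavier tails (an assumed example distribution)
gen double x = rt(30)
// skewness and kurtosis test for normality
sktest x
// look at the actual skewness and kurtosis to judge how large
// the deviation from a Gaussian distribution really is
sum x, detail
*------------------- end example -------------------
With this many observations one would expect -sktest- to reject, even though the deviation it picks up is unlikely to matter for something like a t-test.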
> How can we determine which test (i.e parametric
> or non-parametric ) to be used for smaller sample size in hundreds?!
> I personally think it is irrational to run both t-test and sign test
> on the same sample and hope they both produce the same conclusion! and
> what if they dont!
You need to start with determining the exact null hypothesis you want
to test. The null hypotheses for the t-test and the sign rank test are
not the same, so why would you expect them to lead to the same
conclusion? There are reasonable scenarios where they agree, and
there are equally reasonable scenarios where they do not. So that
is one part of the answer: Know exactly what you want to test, and
compare that with what each test actually tests.
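As a minimal sketch of that point (an illustration only, using the auto data that ships with Stata and an arbitrary test value of 5500), the three commands below each test a different null hypothesis on the same variable:
*------------------ begin example ------------------
sysuse auto, clear
// price is right-skewed, so compare its mean and median
sum price, detail
// H0: the mean of price is 5500
ttest price == 5500
// H0: the median of price is 5500
signtest price = 5500
// H0: the distribution of price is symmetric around 5500
signrank price = 5500
*------------------- end example -------------------
When the mean and the median of a variable are different numbers, there is no reason for these tests to agree, and a disagreement is not a contradiction.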
The second part is that what you want to test needs to be testable
with the data you have, so you need to know your data. If it is real
data then there will always be problems. The question is: are those
problems big enough to cause trouble? That can only be a judgement
call made by you.
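A few commands that could serve as a starting point for getting to know a variable are sketched below. This uses the auto data purely as a stand-in; which checks matter depends on your own data and question.
*------------------ begin example ------------------
sysuse auto, clear
// shape of the distribution, with a Gaussian density overlaid
histogram price, normal
// quantiles of price against quantiles of a Gaussian distribution
qnorm price
// box plot to spot outlying observations
graph box price
*------------------- end example -------------------
None of these graphs returns a verdict; they only show where and how much the data deviate, which is the information the judgement call needs.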
So, unfortunately there is no cookbook style recipe you can follow
when doing research.
> I will follow Nick's advise to look deeper in the data, but I still
> believe that there must be another way to give obvious solution to
> this situation.
No, there really is no alternative to knowing your data and knowing
your tests followed by making an informed judgement call.
Just to increase the number of options open to you: you don't have to
choose between the t-test and the sign rank test, you can also compute
the Achieved Significance Level (ASL) for a t-test. This is a technique
related to the bootstrap. There is a discussion of it in the manual
entry of -bootstrap-, and there are also references there if you want
to read more about it. Below I have adapted that example to the
one-sided one-sample t-test that you seem to want to do.
*------------------ begin example ------------------
sysuse auto, clear
// price does not follow a Gaussian (normal) distribution
// still use a one-sided t-test of H0: mean price = $5,500
// against Ha: mean price > $5,500
// I am not testing whether the mean price is zero as that would
// not make sense for this variable
ttest price = 5500
// store the t-value in a scalar with a temporary name
tempname t
scalar `t' = r(t)
// recenter the mean such that the null hypothesis is true
sum price, meanonly
gen double cprice = price - ( r(mean) - 5500 )
// there is randomness involved in the bootstrap, so for
// reproducibility set the seed
set seed 123456
// bootstrap t-test when H0 is true and store
// the t-values in a temporary dataset (this takes a while)
tempfile bsdata
qui bootstrap t=r(t), reps(20000) saving(`bsdata') nodots : ///
    ttest cprice = 5500
// compute the ASL
use `bsdata', clear
count if t > `t'
di as txt "The achieved significance level (ASL) is: " ///
   as result %6.4f r(N)/_N
// there is randomness involved in the bootstrap, so if we were to
// repeat this we would get a (slightly) different ASL
// If we were to repeat this computation 100 times (without
// setting the seed) then we would expect 95 of these to return
// an ASL between 0.0119 and 0.0152
// (in Stata 14 and later: cii proportions `=_N' `=r(N)')
cii `=_N' `=r(N)'
// So the p-value returned by the t-test (.0281) seems to be a bit
// too large
*------------------- end example -------------------
* (For more on examples I sent to the Statalist see:
* http://www.maartenbuis.nl/example_faq )
Maarten L. Buis