Re: st: sign test output

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: sign test output
Date	Thu, 17 Jan 2013 14:56:01 +0100

On Thu, Jan 17, 2013 at 2:13 PM, Nahla Betelmal wrote:
> However, if normality test is proved to be useful only for huge sample
> as Maarten mentioned.

I would argue that they are not useful under any circumstance. In
small samples the asymptotics have not kicked in yet, so the p-values
don't mean what you think they mean, and in large samples these tests
will detect deviations from Gaussianity that are too small to matter.
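
A minimal sketch of the large-sample half of that argument, using
simulated data. The sample size, the seed, and the choice of a t
distribution with 50 degrees of freedom are illustrative assumptions,
not something taken from your data.

*------------------ begin example ------------------
clear
set seed 123456
set obs 100000

// draws from a t distribution with 50 degrees of freedom, which is
// close enough to a Gaussian for most practical purposes
gen double x = rt(50)

// the deviation is tiny: the kurtosis is only slightly above 3
sum x, detail

// yet with this many observations the skewness and kurtosis test
// would be expected to reject Gaussianity
sktest x
*------------------- end example -------------------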

> How can we determine which test (i.e parametric
> or non-parametric ) to be used for smaller sample size in hundreds?!
>
> I personally think it is irrational to run both t-test and sign test
> on the same sample and hope they both produce the same conclusion! and
> what if they dont!

You need to start with determining the exact null hypothesis you want
to test. The null hypotheses for the t-test and the sign rank test are
not the same, so why would you expect them to lead to the same
conclusion? There are reasonable scenarios where this is true, and
there are equally reasonable scenarios where this is not true. So that
is one part of the answer: Know exactly what you want to test, and
compare that with what each test actually tests.
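
As a rough illustration of how the null hypotheses differ, here is a
sketch on the auto data that ships with Stata. The variable price and
the value 5500 are chosen purely for illustration. The t-test is about
the mean and the sign test is about the median, and in a skewed
variable those two can sit on opposite sides of the hypothesized
value.

*------------------ begin example ------------------
sysuse auto, clear

// the mean and the median of price are not the same number
sum price, detail
di as txt "mean = " as result %8.1f r(mean) ///
   as txt "  median = " as result %8.1f r(p50)

// H0: the mean of price is 5500
ttest price = 5500

// H0: the median of price is 5500
signtest price = 5500
*------------------- end example -------------------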

The second part is that what you want to test needs to be testable
with the data you have, so you need to know your data. If it is real
data, then there will always be problems. The question is: are those
problems big enough to cause trouble? That can only be a judgement
call made by you.
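
What "knowing your data" looks like depends on the data, but a first
pass could be something like the sketch below. It uses the auto data
as a stand-in, and the particular checks are only a suggestion.

*------------------ begin example ------------------
sysuse auto, clear

// skewness and the extreme percentiles show how long the right
// tail is
sum price, detail

// a histogram with a Gaussian curve overlaid
histogram price, normal

// which observations make up the right tail?
gsort -price
list make price in 1/5
*------------------- end example -------------------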

So, unfortunately there is no cookbook style recipe you can follow
when doing research.

> I will follow Nick's advise to look deeper in the data, but I still
> believe that there must be another way to give obvious solution to
> this situation.

No, there really is no alternative to knowing your data and knowing
your tests followed by making an informed judgement call.

Just to increase the number of options open to you: you don't have to
choose between the t-test and the sign rank test; you can also compute
the Achieved Significance Level (ASL) for a t-test. This is a
technique related to the bootstrap. There is a discussion of it in the
manual entry of -bootstrap-, and there are also references there if
you want to read more about it. Below I have adapted that example to
the one-sided one-sample t-test that you seem to want to do.

*------------------ begin example ------------------
sysuse auto, clear

// price does not follow a Gaussian (normal) distribution
qnorm price

// still use a t-test for the one-sided test of H0: mean price =
// $5,500 against Ha: mean price > $5,500
// I am not testing whether the mean price is zero, as that would
// not make sense for this variable
ttest price = 5500

// store the t-value
tempname t
scalar `t' = r(t)

// recenter the mean such that the null hypothesis is true
sum price, meanonly
gen double cprice = price - ( r(mean) - 5500 )
sum cprice

// there is randomness involved in the bootstrap, so for
// reproducibility set the seed
set seed 123456

// bootstrap t-test when H0 is true and store
// the t-values in a dataset (this takes a while)
tempfile bsdata
qui bootstrap t=r(t), reps(20000) saving(`bsdata') nodots : ///
    ttest cprice = 5500

// compute the ASL
use `bsdata', clear
count if t > `t'
#delimit ;
di as txt "The achieved significance level (ASL) is: "
   as result  %6.4f r(N)/_N ;
#delimit cr

// there is randomness involved in the bootstrap, so if we were to
// repeat this we would get a (slightly) different ASL.
// If we were to repeat this computation 100 times (without
// setting the seed), then we would expect 95 of these to return
// an ASL between 0.0119 and 0.0152
cii _N r(N)

// So the p-value returned by the t-test (.0281) seems to be a bit
// too large
*------------------- end example -------------------
* (For more on examples I sent to the Statalist see:
* http://www.maartenbuis.nl/example_faq )
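
If a two-sided test turns out to be what you need, the same idea
carries over by counting bootstrap t-values that are at least as
extreme in either direction. The version below is only a sketch under
that assumption, with reps(2000) chosen to keep the run time short
rather than for precision.

*------------------ begin example ------------------
sysuse auto, clear

// observed t-value for H0: mean price = 5500
ttest price = 5500
tempname t
scalar `t' = r(t)

// recenter so that H0 is true in the data we resample from
sum price, meanonly
gen double cprice = price - ( r(mean) - 5500 )

set seed 123456
tempfile bsdata
qui bootstrap t=r(t), reps(2000) saving(`bsdata') nodots : ///
    ttest cprice = 5500

// two-sided: count the bootstrap t-values at least as extreme in
// either direction
use `bsdata', clear
count if abs(t) >= abs(`t')
di as txt "Two-sided ASL: " as result %6.4f r(N)/_N
*------------------- end example -------------------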

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------