
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


Re: st: sign test output

From   "Seed, Paul" <>
To   "" <>
Subject   Re: st: sign test output
Date   Fri, 18 Jan 2013 10:48:53 +0000

Dear Statalist, 

While the discussion of Nahla Betelmal's query has been interesting and 
informative, one point seems to have been missed: the question is ill-defined.

It appears that Nahla Betelmal has a variable that she wants/expects 
for good theoretical reasons to have an average of 0; and wants to test 
if this is true.  We are not told any more.

If (s)he came to me for statistical advice, I would instantly want to know 
	- what the theoretical reasons were
	- which average (the mean or the median) was expected to be 0
	- how large a tolerance was acceptable 
	- what the implications would be if the average was not 0.
Until I had a clear understanding, I would not want to start analysing data.

The second question is crucial.  For a seriously non-normal distribution, 
the mean and the median can be quite different, and it is possible 
to construct examples where the mean is significantly > 0, while 
the median is significantly < 0.  
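As an illustration of such an example, here is a minimal sketch in Python (not Stata), using a made-up sample of 70 values of -1 and 30 values of 20. The mean is 5.3 and the median is -1; the sketch also computes a one-sample t statistic for the mean and an exact two-sided sign test for the median. The data and variable names are assumptions chosen only for illustration.

```python
import math
import statistics
from math import comb

# Hypothetical sample: many small negative values, a few large positive ones.
x = [-1.0] * 70 + [20.0] * 30

n = len(x)
mean = statistics.fmean(x)
median = statistics.median(x)

# One-sample t statistic for H0: mean = 0.
t_stat = mean / (statistics.stdev(x) / math.sqrt(n))

# Exact two-sided sign test for H0: median = 0 (zeros are dropped).
n_neg = sum(v < 0 for v in x)
n_pos = sum(v > 0 for v in x)
m = n_neg + n_pos
k = min(n_neg, n_pos)
p_sign = min(1.0, 2 * sum(comb(m, i) for i in range(k + 1)) / 2**m)

print(f"mean   = {mean:.2f}, t statistic = {t_stat:.2f}")
print(f"median = {median:.2f}, sign-test p = {p_sign:.2g}")
```

With this sample a t-test and a sign test point in opposite directions, which is why it matters which average is the object of interest.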

Normality checks would be mainly graphical, for the reasons discussed; 
but I might look at measures of skewness, kurtosis and in particular compare 
whether the mean and median were sufficiently close for it not to matter which 
I used.  (Estimates of the mean are usually more precise, so with low skew and 
mean close to median, I might prefer to use the mean even if the median were
the main object of interest.)
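The numerical side of these checks can be sketched as follows, in Python rather than Stata. The function name `shape_summary` is hypothetical, and the skewness and kurtosis are the plain moment-based versions (kurtosis is 3 for a Normal distribution); other software may apply small-sample adjustments.

```python
import statistics

def shape_summary(x):
    """Return mean, median, moment-based skewness and kurtosis of a sample."""
    n = len(x)
    mean = statistics.fmean(x)
    # Central moments with divisor n (no small-sample correction).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean": mean,
        "median": statistics.median(x),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }

# Hypothetical right-skewed sample: the mean sits well above the median.
print(shape_summary([0, 0, 0, 0, 10]))
```

A large gap between mean and median, relative to the spread of the data, is the practical warning sign; skewness near 0 suggests the choice between them matters less.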

Assuming interest was in the mean, I would advise one or more of
	- one-sample t-test (quick, simple, and usually sufficient)
	- linear regression with robust standard errors (a basic correction for non-Normality)
	- bootstrapped linear regression with BCa confidence intervals (a fuller correction 
		that can give asymmetrical CIs where appropriate, e.g. in cases of extreme non-Normality).

All methods are well described in the Stata manual, and usually give very similar answers
(except for extreme cases of non-Normality).
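To make the comparison concrete, here is a rough sketch in Python (not Stata) of the first and last approaches applied to a simulated skewed sample. Note the substitution: it uses a simple percentile bootstrap, not the BCa interval named above, and the robust-regression step is not shown. The data, seed, and function names are assumptions for illustration only.

```python
import math
import random
import statistics

def t_statistic(x, mu0=0.0):
    """One-sample t statistic for H0: mean = mu0."""
    se = statistics.stdev(x) / math.sqrt(len(x))
    return (statistics.fmean(x) - mu0) / se

def bootstrap_mean_ci(x, reps=2000, level=0.95, seed=12345):
    """Percentile bootstrap CI for the mean (a simpler stand-in for BCa)."""
    rng = random.Random(seed)
    n = len(x)
    # Resample with replacement and record the mean of each resample.
    means = sorted(statistics.fmean(rng.choices(x, k=n)) for _ in range(reps))
    alpha = 1 - level
    lo = means[math.floor(alpha / 2 * reps)]
    hi = means[math.ceil((1 - alpha / 2) * reps) - 1]
    return lo, hi

# Simulated, right-skewed data whose true mean is 0 (exponential minus 1).
rng = random.Random(2013)
x = [rng.expovariate(1.0) - 1.0 for _ in range(200)]

print("t statistic:", round(t_statistic(x), 3))
print("bootstrap 95% CI for the mean:", bootstrap_mean_ci(x))
```

Because the interval is built from the resampled means themselves, it need not be symmetric about the sample mean.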

If interest was in the median, and I didn't trust the Normal approximation, 
I would use the -centile- command with the -cci- option to get a confidence interval 
for the median.
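The idea behind a distribution-free interval of this kind can be sketched in Python as below: the limits are order statistics chosen from the Binomial(n, 1/2) distribution so that coverage is at least the nominal level. This is an illustration of the general technique, not a reproduction of Stata's output; the function name `median_ci` is hypothetical.

```python
from math import comb

def median_ci(x, level=0.95):
    """Conservative CI for the median from order statistics.

    Returns (lower, upper, actual_coverage), where actual_coverage >= level
    for a continuous distribution.
    """
    xs = sorted(x)
    n = len(xs)
    alpha = 1 - level
    # Find the largest r with P(B <= r - 1) <= alpha / 2, B ~ Binomial(n, 1/2).
    cum = 0.0
    r = 0
    for k in range(n + 1):
        cum += comb(n, k) / 2**n
        if cum > alpha / 2:
            break
        r = k + 1
    if r == 0:
        raise ValueError("sample too small for the requested level")
    coverage = 1 - 2 * sum(comb(n, k) for k in range(r)) / 2**n
    # Interval runs from the r-th smallest to the r-th largest observation.
    return xs[r - 1], xs[n - r], coverage

print(median_ci(range(1, 11)))
```

For ten observations the 95% interval runs from the 2nd to the 9th ordered value, with actual coverage of about 97.9%.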

In each case I would direct attention to the confidence interval, and to the question of whether
the answer was sufficiently close to 0 (as defined by the third question).
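One way to read a confidence interval against a pre-agreed tolerance is sketched below in Python. The three-way classification and the function name `compare_with_tolerance` are assumptions for illustration, not a rule stated in this thread.

```python
def compare_with_tolerance(ci_low, ci_high, tol):
    """Classify a confidence interval against the band [-tol, +tol]."""
    if ci_low > ci_high or tol < 0:
        raise ValueError("need ci_low <= ci_high and tol >= 0")
    if -tol <= ci_low and ci_high <= tol:
        return "within tolerance"    # every plausible value is close enough to 0
    if ci_high < -tol or ci_low > tol:
        return "outside tolerance"   # every plausible value is too far from 0
    return "inconclusive"            # the interval straddles a tolerance limit

# Hypothetical interval and tolerance.
print(compare_with_tolerance(-0.1, 0.2, 0.5))
```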

All this assumes that the ultimate interest is in the answer to this question.
If it was just a preliminary to another analysis, or the answer was wanted for 
some deduction that could be made from it, I would also look for other 
ways of addressing the real question, whatever it might be.

On Jan 17, 2013, at 5:13, Nahla Betelmal <> wrote:

> Again, thank you both for your comments.
> However, if normality test is proved to be useful only for huge sample
> as Maarten mentioned. How can we determine which test (i.e. parametric
> or non-parametric ) to be used for smaller sample size in hundreds?!
> I personally think it is irrational to run both t-test and sign test
> on the same sample and hope they both produce the same conclusion! and
> what if they don't!
> I will follow Nick's advise to look deeper in the data, but I still
> believe that there must be another way to give obvious solution to
> this situation.
> Thank you both again, I highly appreciate your kind help and time,
> Nahla

Paul T Seed, Senior Lecturer in Medical Statistics, 
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners 
(+44) (0) 20 7188 3642.
