Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: sign test output

| From | To | Subject | Date |
| --- | --- | --- | --- |
| "Seed, Paul" | "statalist@hsphsun2.harvard.edu" | Re: st: sign test output | Fri, 18 Jan 2013 10:48:53 +0000 |

```
Dear Statalist,

While the discussion of Nahla Betelmal's query has been interesting and
informative, one point seems to have been missed: the question is ill-defined.

It appears that Nahla Betelmal has a variable that she wants/expects
for good theoretical reasons to have an average of 0; and wants to test
if this is true.  We are not told any more.

If (s)he came to me for statistical advice, I would instantly want to know
- what the theoretical reasons were
- which average (the mean or the median) was expected to be 0
- how large a tolerance was acceptable
- what the implications would be if the average was not 0.
Until I had a clear understanding, I would not want to start analysing data.

The second question is crucial.  For a seriously non-normal distribution,
the mean and the median can be quite different, and it is possible
to construct examples where the mean is significantly > 0, while
the median is significantly < 0.

Normality checks would be mainly graphical, for the reasons discussed;
but I might look at measures of skewness, kurtosis and in particular compare
whether the mean and median were sufficiently close for it not to matter which
I used.  (Estimates of the mean are usually more precise than estimates of the
median, so with low skew and mean close to median, I might prefer to use the
mean even if the median were the main object of interest.)

Assuming interest was in the mean, I would advise one or more of:
- one-sample t-test (quick, simple, and usually sufficient)
- linear regression with robust standard errors (a basic correction for non-Normality)
- bootstrapped linear regression with BCa confidence intervals (a fuller correction
  that can give asymmetrical CIs where appropriate, e.g. in cases of extreme non-Normality).

All methods are well described in the Stata manual, and usually give very similar answers
(except for extreme cases of non-Normality).

If interest was in the median, and I didn't trust the Normal approximation,
I would use the -centile- command with the -cci- option to get a confidence interval
for the median.

In each case I would direct attention to the confidence interval, and to the question of whether
the answer was sufficiently close to 0 (as defined by the third question).

All this assumes that the ultimate interest is in the answer to this question.
If it was just a preliminary to another analysis, or the answer was wanted for
some deduction that could be made from it, I would also look for other
ways of addressing the real question, whatever it might be.

On Jan 17, 2013, at 5:13, Nahla Betelmal <nahlaib@gmail.com> wrote:

>
> However, if normality test is proved to be useful only for huge sample
> as Maarten mentioned. How can we determine which test (i.e. parametric
> or non-parametric ) to be used for smaller sample size in hundreds?!
>
> I personally think it is irrational to run both t-test and sign test
> on the same sample and hope they both produce the same conclusion! and
> what if they don't!
>
> I will follow Nick's advise to look deeper in the data, but I still
> believe that there must be another way to give obvious solution to
> this situation.
>
> Thank you both again, I highly appreciate your kind help and time,
>
> Nahla
>
>

Paul T Seed, Senior Lecturer in Medical Statistics,
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners
(+44) (0) 20 7188 3642.

```
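
The checks and tests named in the message can be sketched as a short Stata do-file. This is only an illustration under stated assumptions: the variable `y` and its simulated, right-skewed data are hypothetical (they are not the original poster's data), and the number of bootstrap replications is an arbitrary choice. The simulated population has mean 0.3 and median of roughly -0.33, the kind of case where the choice between mean and median matters.

```stata
* Hypothetical data: a right-skewed variable y (NOT the poster's data).
* chi-squared(3) has mean 3 and median of about 2.37, so y has
* population mean 0.3 and population median of roughly -0.33.
clear
set seed 12345
set obs 300
generate y = rchi2(3) - 2.7

* Descriptive checks: mean, median (50th percentile), skewness, kurtosis
summarize y, detail

* Graphical normality checks
histogram y, normal
qnorm y

* Interest in the mean
ttest y == 0                        // one-sample t-test
regress y, vce(robust)              // constant-only model, robust standard errors
bootstrap _b, reps(1000) bca seed(12345): regress y
estat bootstrap, bca                // bias-corrected and accelerated (BCa) CI

* Interest in the median
centile y, centile(50) cci          // conservative binomial-based CI for the median
signtest y = 0                      // sign test of median = 0, as in the thread title
```

In each case the confidence interval, rather than the p-value alone, is what would be compared against the tolerance around 0 that was agreed beforehand.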