Re: st: sign test output

From	"Seed, Paul" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: sign test output
Date	Fri, 18 Jan 2013 10:48:53 +0000

Dear Statalist, 

While the discussion of Nahla Betelmal's query has been interesting and 
informative, one point seems to have been missed: the question is ill-defined.

It appears that Nahla Betelmal has a variable that she wants or expects, 
for good theoretical reasons, to have an average of 0, and wants to test 
whether this is true.  We are not told any more.

If (s)he came to me for statistical advice, I would instantly want to know 
	- what the theoretical reasons were
	- which average (the mean or the median) was expected to be 0
	- how large a tolerance was acceptable 
	- what the implications would be if the average was not 0.
Until I had a clear understanding, I would not want to start analysing data.

The second question is crucial.  For a seriously non-normal distribution, 
the mean and the median can be quite different, and it is possible 
to construct examples where the mean is significantly > 0, while 
the median is significantly < 0.  
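
To make this concrete, here is a minimal sketch of one such constructed 
example, written in Python with the standard library only (it is not Stata 
syntax).  The data are invented for illustration: 70 observations at -1 and 
30 at +10.  The t statistic and the exact two-sided sign test are computed 
by hand, and the function name sign_test_p is my own, not a library call.

```python
import math
import statistics

# Invented data: most values slightly below 0, a minority far above 0.
data = [-1.0] * 70 + [10.0] * 30
n = len(data)

mean = statistics.mean(data)        # pulled upward by the 30 large values
median = statistics.median(data)    # determined by the majority at -1

# One-sample t statistic for H0: mean = 0.
se = statistics.stdev(data) / math.sqrt(n)
t_stat = mean / se


def sign_test_p(values, null_value=0.0):
    """Exact two-sided sign test p-value for H0: median = null_value."""
    pos = sum(v > null_value for v in values)
    neg = sum(v < null_value for v in values)
    m = pos + neg                    # values equal to null_value are dropped
    k = min(pos, neg)
    # Lower binomial tail P(X <= k) for X ~ Binomial(m, 0.5), then doubled.
    tail = sum(math.comb(m, i) for i in range(k + 1)) / 2 ** m
    return min(1.0, 2 * tail)


p_sign = sign_test_p(data)

print(f"mean   = {mean:.2f}, t statistic  = {t_stat:.2f}")
print(f"median = {median:.2f}, sign test p = {p_sign:.5f}")
```

Here the mean is 2.3 with a t statistic of roughly 4.5, while the median is -1 
and the sign test also rejects at conventional levels, so the two tests point 
in opposite directions.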

Normality checks would be mainly graphical, for the reasons discussed; 
but I might also look at measures of skewness and kurtosis, and in particular 
check whether the mean and median were close enough for it not to matter which 
I used.  (Estimates of the mean are usually more precise than estimates of the 
median when the data are close to Normal, so with low skew and mean close to 
median, I might prefer to use the mean even if the median were the main object 
of interest.)
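
As a rough illustration of these numerical checks, the sketch below (Python, 
standard library only, not Stata syntax) computes the mean, the median, and 
the plain moment-based skewness and kurtosis, with kurtosis scaled so that a 
Normal distribution gives 3.  The function name shape_summary and both small 
samples are invented for the example.

```python
import statistics


def shape_summary(values):
    """Return mean, median, moment skewness and kurtosis (Normal = 3)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments with divisor n.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Invented samples: one symmetric, one with a single large value.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
skewed = [0.0, 0.0, 0.0, 0.0, 10.0]

for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, shape_summary(sample))
```

For the symmetric sample the skewness is 0 and the mean equals the median; for 
the skewed sample the skewness is 1.5 and the mean (2) sits well above the 
median (0), which is the situation in which the choice of average matters.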

Assuming interest was in the mean, I would advise one or more of
	one-sample t-test (quick, simple, and usually sufficient)
	linear regression on a constant only, with robust standard errors (a basic correction for non-Normality)
	bootstrapped linear regression with BCa (bias-corrected and accelerated) confidence intervals 
		(a fuller correction that can give asymmetrical CIs where appropriate, 
		e.g. in cases of extreme non-Normality).

All methods are well described in the Stata manual, and usually give very similar answers
(except for extreme cases of non-Normality).
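
To show why a BCa interval can be asymmetrical, here is a minimal sketch of 
the bias-corrected and accelerated bootstrap for a mean, in Python with the 
standard library only.  It is not Stata's implementation and is not meant to 
reproduce its output; the function name bca_interval_for_mean, the number of 
replications, the seed, and the right-skewed sample are all assumptions made 
for the example.

```python
import random
import statistics
from statistics import NormalDist


def bca_interval_for_mean(values, reps=2000, level=0.95, seed=12345):
    """BCa bootstrap confidence interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(values)
    estimate = statistics.fmean(values)

    # Bootstrap distribution: means of n values resampled with replacement.
    boot = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(reps)
    )

    # Bias correction z0, from the share of bootstrap means below the estimate
    # (clamped away from 0 and 1 so the Normal quantile is finite).
    below = sum(b < estimate for b in boot)
    share = min(max(below / reps, 1 / (reps + 1)), reps / (reps + 1))
    z0 = NormalDist().inv_cdf(share)

    # Acceleration, from jackknife (leave-one-out) means.
    total = sum(values)
    jack = [(total - v) / (n - 1) for v in values]
    jack_mean = statistics.fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(alpha):
        # Shift the nominal percentile using z0 and the acceleration.
        z = NormalDist().inv_cdf(alpha)
        adj = NormalDist().cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        index = min(max(int(adj * reps), 0), reps - 1)
        return boot[index]

    alpha = (1.0 - level) / 2.0
    return adjusted_quantile(alpha), adjusted_quantile(1.0 - alpha)


# Invented right-skewed sample: mostly small values plus two large ones.
sample = [0.1, 0.3, 0.2, 0.5, 0.4, 0.2, 0.6, 0.3, 4.8, 0.1,
          0.7, 0.2, 0.9, 0.3, 6.5, 0.4, 0.2, 1.1, 0.5, 0.3]

mean = statistics.fmean(sample)
low, high = bca_interval_for_mean(sample)
print(f"mean = {mean:.3f}, BCa 95% CI = ({low:.3f}, {high:.3f})")
```

With a sample like this the upper limit lies further from the mean than the 
lower limit, which a symmetric t-based interval cannot show.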

If interest was in the median, and I didn't trust the Normal approximation, 
I would use the -centile- command with the -cci- option to get a confidence interval 
for the median.
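
The idea behind a conservative interval of this kind can be sketched as 
follows: take a pair of order statistics, symmetric about the middle of the 
sorted data, chosen from the Binomial(n, 0.5) distribution so that the 
coverage is at least the nominal level.  This is a Python illustration of the 
general order-statistic method, under the assumption of a continuous 
distribution; I have not checked that it reproduces -centile, cci- digit for 
digit, and the function name conservative_median_ci is my own.

```python
import math


def conservative_median_ci(values, level=0.95):
    """Order-statistic CI for the median with coverage of at least `level`."""
    xs = sorted(values)
    n = len(xs)
    alpha = 1.0 - level

    def binom_cdf(k):
        # P(X <= k) for X ~ Binomial(n, 0.5)
        return sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n

    # Find the largest rank r (1-based) with P(X <= r - 1) <= alpha / 2.
    r = 0
    while r + 1 <= n // 2 and binom_cdf(r) <= alpha / 2:
        r += 1
    if r == 0:
        raise ValueError("sample too small for the requested level")

    # The interval runs from the r-th smallest to the r-th largest value.
    coverage = 1.0 - 2.0 * binom_cdf(r - 1)
    return xs[r - 1], xs[n - r], coverage


# Invented data: the integers 1 to 10.
lower, upper, coverage = conservative_median_ci(list(range(1, 11)))
print(f"CI for median: ({lower}, {upper}), actual coverage {coverage:.4f}")
```

For ten observations the interval runs from the 2nd to the 9th ordered value, 
with actual coverage of about 97.9%, which is why it is called conservative.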

In each case I would direct attention to the confidence interval, and to the question of whether
the answer was sufficiently close to 0 (as defined by the answer to the third question above).

All this assumes that the ultimate interest is in the answer to this question.
If it was just a preliminary to another analysis, or the answer was wanted for 
some deduction that could be made from it, I would also look for other 
ways of addressing the real question, whatever it might be.

On Jan 17, 2013, at 5:13, Nahla Betelmal <[email protected]> wrote:

> Again, thank you both for your comments.
> 
> However, if normality tests prove to be useful only for huge samples,
> as Maarten mentioned, how can we determine which test (i.e. parametric
> or non-parametric) to use for smaller samples, in the hundreds?!
> 
> I personally think it is irrational to run both t-test and sign test
> on the same sample and hope they both produce the same conclusion! and
> what if they don't!
> 
> I will follow Nick's advice to look deeper into the data, but I still
> believe that there must be another way to give an obvious solution to
> this situation.
> 
> Thank you both again, I highly appreciate your kind help and time,
> 
> Nahla
> 
>

Paul T Seed, Senior Lecturer in Medical Statistics, 
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners 
(+44) (0) 20 7188 3642.
