The Stata listserver

RE: st: logarithmic scales

From   "Wallace, John" <>
To   "''" <>
Subject   RE: st: logarithmic scales
Date   Mon, 1 Dec 2003 11:41:03 -0800

One place where tiny p-values are important is in multiple-comparison tests,
where you're applying some sort of Bonferroni-like correction or venturing
into False Discovery rates and the like.  If you stack enough tests on top
of one another in a given analysis, it's likely you'll meet that p-value
cutoff of significance purely by chance...
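
To put rough numbers on that, here is a minimal Python sketch. It assumes the
tests are independent and every null is true, which real analyses seldom
satisfy; the function names are made up for illustration.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one 'significant' result among m independent
    tests at level alpha when every null hypothesis is true (assumption)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test cutoff that keeps the familywise error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        fwer = familywise_error_rate(0.05, m)
        cutoff = bonferroni_threshold(0.05, m)
        print(f"m={m:5d}  P(at least one false positive)={fwer:.3f}  "
              f"Bonferroni cutoff={cutoff:.1e}")
```

With enough tests the corrected cutoff gets very small, which is where the
tiny p-values start to matter.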

I've seen a fair number of presentations (powerpoint-"disabled" to boot)
with lecturers implying the thermonuclear detonation of the null hypothesis
(thanks for the image, Nick).  It seems to me that in such instances rather
than going after the null with such bloodthirsty vigour, time would be better
spent looking at the alternative hypotheses (i.e., that the variable
accounts for no more than N% of the variance in the population measured).


-----Original Message-----
From: Roger Newson [] 
Sent: Monday, December 01, 2003 8:43 AM
Subject: RE: st: logarithmic scales

At 15:31 01/12/03 +0000, Nick Cox wrote:

>I'd assert, perhaps very rashly, that beyond
>some threshold, very low P-values are
>practically indistinguishable. I suppose that
>log P-value of -20 is often appealing as a kind of
>thermonuclear demolition of a null hypothesis, but I wonder
>if anyone would think differently of (say) -6. Also,
>as is well known, the further you go out into
>the tail the more you depend on everything being
>as it should be (model assumptions, data without
>measurement error, numerical analysis...).
>On the other hand, there are situations
>in which an overwhelming P-value is needed
>for any ensuing decision.

A good discussion of this issue is given in Subsection 35.7 of Kirkwood and 
Sterne (2003), which is a basic text aimed mostly at non-mathematicians. 
This uses a Bayesian heuristic, based on the well-known result that the 
posterior odds between 2 hypotheses after the data analysis is equal to the 
prior odds between the same 2 hypotheses multiplied by the likelihood ratio 
between the 2 hypotheses. It is argued that a P-value below 0.001 is good 
enough for most of the people most of the time, because, *if* the prior 
odds are as bad as 100:1 against a nonzero population difference, *and* the 
power to detect a difference significant at P<=0.001 is as low as 0.5, 
*then* the posterior odds in favour of a nonzero population difference, 
given a P-value <=0.001, will be 5:1 in favour.
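
The arithmetic behind that 5:1 can be checked with a short Python sketch. It 
assumes the likelihood ratio for the event "P <= alpha" is approximated by 
power/alpha, i.e. P(significant | difference) / P(significant | no 
difference); the function name is made up for illustration.

```python
def posterior_odds(prior_odds, power, alpha):
    """Posterior odds = prior odds x likelihood ratio.

    The likelihood ratio for the event 'P <= alpha' is taken to be
    power / alpha (an assumption of this sketch).
    """
    likelihood_ratio = power / alpha
    return prior_odds * likelihood_ratio


if __name__ == "__main__":
    # Prior odds 100:1 against a nonzero difference, power 0.5, alpha 0.001
    odds = posterior_odds(prior_odds=1 / 100, power=0.5, alpha=0.001)
    print(f"posterior odds in favour: {odds:.1f}:1")
```

Here the likelihood ratio is 0.5/0.001 = 500, and 500 times prior odds of 
1/100 gives the 5:1 quoted above.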

This heuristic seems to make sense to me, if the P-value is for the 
parameter of prior interest in the study design protocol, because not many 
grant-awarding bodies will pay for a study for which they consider the 
prior odds of an interesting difference to be worse than 100:1 against. On 
the other hand, in the real world, with today's technology, it is nearly 
always cheaper to torture the data until they confess than to collect more 
data. Therefore, a lot of people's colleagues expect them to do "subset 
analyses from hell", and are reluctant to write up negative results as 
such. Therefore, an honest scientist who wants to accumulate publications 
is often not a data miner, but a "data lawyer", cross-examining the data 
on the moral equivalent of a "no-win no-fee" contract. Under these 
conditions, a lot of statistically-minded scientists will forget what they 
learned at college, and do what they are told, and torture the data. If the 
P-value is from one of a sequence of subset analyses, and is undertaken 
posterior to a main analysis which found nothing, then, arguably, the 
"prior odds" against an interesting difference might reasonably be worse 
than 100:1 against.



Kirkwood BR, Sterne JAC. Essential medical statistics. Second edition. 
Oxford, UK: Blackwell Science; 2003.
