
Re: st: ttest or xtmelogit?

From   Steven Samuels
Subject   Re: st: ttest or xtmelogit?
Date   Tue, 11 Mar 2008 15:10:12 -0400

Not mentioned in -transint- is the variance-stabilizing property of the angular transformation: with the angle in radians, it has asymptotic variance 1/(4n), which is not a function of p (Anscombe, 1948). If the observed proportion is r/n, Anscombe showed that the arcsine of [(r + 3/8)/(n + 3/4)]^0.5 stabilizes the variance even better for moderate sample sizes. This second version has variance of approximately 1/(4n + 2).
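To make the variance-stabilizing claim concrete, here is a minimal Python sketch (Python rather than Stata so it can be run anywhere; every function name is made up for illustration). It computes the exact variance of each transformed proportion under a Binomial(n, p) model and sets it beside the raw variance p(1 - p)/n and the two approximations quoted above. The choices n = 50 and the values of p are arbitrary.

```python
import math

def binom_pmf(r, n, p):
    """Probability of r successes in n Bernoulli(p) trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def exact_variance(transform, n, p):
    """Exact variance of transform(r, n) when r ~ Binomial(n, p)."""
    probs = [binom_pmf(r, n, p) for r in range(n + 1)]
    values = [transform(r, n) for r in range(n + 1)]
    mean = sum(w * v for w, v in zip(probs, values))
    return sum(w * (v - mean) ** 2 for w, v in zip(probs, values))

def angular(r, n):
    """Plain angular transformation, in radians."""
    return math.asin(math.sqrt(r / n))

def anscombe(r, n):
    """Anscombe's modified angular transformation, in radians."""
    return math.asin(math.sqrt((r + 3 / 8) / (n + 3 / 4)))

n = 50  # arbitrary illustrative sample size
print("p     raw       angular   1/(4n)    anscombe  1/(4n+2)")
for p in (0.1, 0.3, 0.5):
    print(f"{p:.1f}  "
          f"{p * (1 - p) / n:.6f}  "
          f"{exact_variance(angular, n, p):.6f}  "
          f"{1 / (4 * n):.6f}  "
          f"{exact_variance(anscombe, n, p):.6f}  "
          f"{1 / (4 * n + 2):.6f}")
```

The raw column changes with p, while the two transformed columns should stay close to their constant targets unless np is small.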

The arcsine transformation used to be recommended because transformed proportions could be analyzed via standard ANOVA programs. I once found it useful in a variance components analysis. The 'error' variance was a mixture of a between-sample variance and a within-sample (binomial) variance. With the arcsine transformation, I could subtract out the part attributable to binomial variation.
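The subtraction idea can be sketched as follows. This is only one possible reading: the post does not say which estimator was used, so the simple moment estimator, the equal sample size n for every sample, and the function name are all assumptions, and the counts are made up.

```python
import math
import statistics

def between_sample_variance(counts, n):
    """Split the variance of angular-transformed proportions.

    Assumes every sample has the same size n, so the binomial part on the
    angular scale (radians) is roughly 1/(4n) regardless of p.
    Returns (between, total, binomial_part).
    """
    y = [math.asin(math.sqrt(r / n)) for r in counts]
    total = statistics.variance(y)       # observed 'error' variance
    binomial_part = 1 / (4 * n)          # within-sample (binomial) variance
    between = max(total - binomial_part, 0.0)  # floor at zero
    return between, total, binomial_part

# Made-up success counts for 8 samples of n = 40 each
counts = [12, 30, 22, 8, 35, 18, 27, 15]
between, total, binomial_part = between_sample_variance(counts, 40)
print(f"total = {total:.5f}, binomial = {binomial_part:.5f}, "
      f"between = {between:.5f}")
```

On the raw proportion scale the binomial part would be p(1 - p)/n and so would differ from sample to sample; the constant 1/(4n) is what makes the subtraction simple here.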


Anscombe, F. J. 1948. The transformation of Poisson, binomial and negative-binomial data. Biometrika 35: 246-254.

On Mar 10, 2008, at 6:02 PM, Nick Cox wrote:

By arcsin I guess you mean the angular transformation (arcsine of square
root). Its use seems to have faded dramatically in recent years.

Tukey showed that this is very close to p^0.41 - (1 - p)^0.41. That
makes it weaker than the logit. My guess is that it would be an unusual
dataset in which the angular was much better than leaving data as is
and also much better than the logit. It could happen, but it seems to
be rare.
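The closeness to p^0.41 - (1 - p)^0.41 can be checked numerically. The sketch below is illustrative only: it assumes "very close" means close after centering the angular transformation at p = 0.5 and rescaling, and the grid, the least-squares scale factor, and the function names are my own choices. For context, the folded power (p^k - (1 - p)^k)/k tends to the logit as k goes to 0, which is the sense in which a power of 0.41 is weaker than the logit.

```python
import math

def angular_centered(p):
    """Angular transformation shifted to be 0 at p = 0.5.

    2*asin(sqrt(p)) - pi/2 is the same function as asin(2p - 1).
    """
    return 2 * math.asin(math.sqrt(p)) - math.pi / 2

def folded_power(p, power=0.41):
    """Folded power p^power - (1 - p)^power."""
    return p ** power - (1 - p) ** power

grid = [i / 100 for i in range(1, 100)]  # p = 0.01, ..., 0.99
a = [angular_centered(p) for p in grid]
f = [folded_power(p) for p in grid]

# Least-squares scale factor through the origin, then the largest gap
scale = sum(x * y for x, y in zip(a, f)) / sum(y * y for y in f)
max_gap = max(abs(x - scale * y) for x, y in zip(a, f))
print(f"scale = {scale:.4f}, largest gap = {max_gap:.4f} radians")
```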

The Tukey reference is given in -transint- from SSC.


David Airey

Maybe I should not have said it was pilot data! I won't disagree, but
when cluster number is too small (< 20) to invoke xtgee or xtmelogit
on the observed yes/no data, or glm on the summary statistics with
binomial family and logit link, what do you do? It seems to me there
is a sample size between 10 and 30 clusters of yes/no data that may be
better suited to some of the older approaches like arcsin transformed
proportions and then ttest or ANOVA/regress. I guess that was my
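
The older approach mentioned here, angular-transformed cluster proportions followed by a two-group t test, might look like the sketch below. It is an illustration under assumptions, not a recommendation: the data are made up, each cluster is reduced to a single transformed proportion, and the pooled-variance (equal-variance) form of the t statistic is used. Only the statistic and its degrees of freedom are computed; a p-value would need a t distribution from a statistics package.

```python
import math
import statistics

def angular(r, n):
    """Angular transformation of the proportion r/n, in radians."""
    return math.asin(math.sqrt(r / n))

def pooled_t(x, y):
    """Two-sample pooled-variance t statistic and degrees of freedom."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se, df

# Made-up data: (yes count, cluster size) for 6 clusters per group
group_a = [(3, 20), (5, 20), (4, 20), (6, 20), (2, 20), (5, 20)]
group_b = [(9, 20), (7, 20), (11, 20), (8, 20), (10, 20), (6, 20)]

t_stat, df = pooled_t([angular(r, n) for r, n in group_a],
                      [angular(r, n) for r, n in group_b])
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```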

