Stata The Stata listserver

st: RE: Rankit, pearson and polychoric correlations [was: Ordinal to interval assuming normality]

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Rankit, pearson and polychoric correlations [was: Ordinal to interval assuming normality]
Date   Mon, 20 Oct 2003 19:07:38 +0100

Michael Ingre

> Thank you Nick Cox, your neat suggestion -almost- (I think)
> did the trick.
> > . sysuse auto
> > . ssc inst egenmore
> > . egen ridit = ridit(rep78)
> > . gen pseudogauss = invnorm(ridit)
> > . tabdisp rep78, c(ridit pseudogauss)
> But, wouldn't the estimation using ridit scores (mean rank
> / sample size)
> bias z slightly towards zero when you have many
> observations/category? That
> is, z-scores would increase more to the left than decrease
> to the right of
> the rank mean within a category (on the left-hand side of
> the distribution).

The ridit transformation as implemented in -egen, ridit()-
is just one that has worked well for me in some exploratory
contexts. (Some references are in the help file for -distplot-.)
I have used it as a transformation procedure rather than an attempt
to estimate a latent quantity. So, you might want to modify it.
Feel free to copy the code and hack away.

> > Is this not (related to) the
> > rankit transformation of Fisher and Yates?
> Yes, I think you are right. I tried to find out more
> about it; however,
> this procedure is not mentioned in any textbook that I
> could find. Searching
> the net gave me a few hits. It is close to your suggestion:
> rankit = invnorm( (mean rank - 0.375) / (sample size + 0.25) )

These constants are those thrown up by the 1958 analysis
of your fellow Swede Gunnar Blom (1920-2003). You can find others
in discussions of plotting positions. By and large,
I suggest that this is a secondary issue unless sample size
is very small, in which case no white magic will help much.
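
For anyone who wants to try the Blom constants, here is a minimal
sketch, not a definitive implementation. It assumes mean ranks from
-egen, rank()- (which averages ranks over ties by default) and takes
sample size to be the count of nonmissing values. The variable names
meanrank, n and blom are only illustrative, and -invnormal()- is the
newer name for -invnorm()-.

```stata
* Sketch: Blom-type normal scores for an ordinal variable
* (meanrank, n and blom are illustrative names, not from any package)
sysuse auto, clear
egen meanrank = rank(rep78)    // tied values share their mean rank
egen n = count(rep78)          // number of nonmissing values of rep78
gen blom = invnormal((meanrank - 0.375) / (n + 0.25))
tabdisp rep78, c(meanrank blom)
```

Tabulating these scores alongside pseudogauss from the ridit route
shows how far the two sets of constants actually differ for a given
sample size.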

Attempts to find highly exact solutions to problems posed by
highly inexact data always seem somewhat strained to me,
but an ultracynic might claim that to be a definition of
statistical science in general...

[email protected]
