# RE: st: RE: New package -wridit- on SSC

From: Nick Cox
To: "'statalist@hsphsun2.harvard.edu'"
Subject: RE: st: RE: New package -wridit- on SSC
Date: Tue, 28 Feb 2012 17:39:53 +0000

```
Paul's post makes me suspect that my using the name -ridit()- for the -egen- function in -egenmore- (SSC) was a bad idea. That said, I don't feel energetic enough to change it, not least because I suspect that the original sense of Bross is no longer widely used. I do respect history, however.

That interesting detail aside, I have used ridits, or the same beast by any other name, in this way.

When plotting cumulative probabilities for an ordinal variable, using the operator < gives the lowest category a probability of 0 and using the operator <= gives the highest category a probability of 1. Neither treats categories symmetrically and both are inconvenient if you feel tempted to plot on a logit scale, as I often am, because the logit of 0 or 1 is not defined. So centred cumulative probabilities, meaning probability of values below + (1/2) probability of this value, are useful graphically. The point is discussed, indeed laboured, in Section 5 of

SJ-4-2  gr0004  .  Speaking Stata: Graphing categorical and compositional data
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
Q2/04   SJ 4(2):190--215                                 (no commands)
discusses graphical possibilities for categorical and
compositional data

and in the help file of -distplot- (latest public version downloadable from SJ 10-1 files).

A grandparent of all this is possibly

Galton, F. 1907. Grades and deviates. Biometrika 5: 400-406.

In a now widely used notation and terminology Galton proposed plotting positions (i - 1/2) / n.

Thanks as ever to Roger & Nick for their work on this.
I went back to the original article by Bross to try to understand what ridits were:
Bross, I. D. J.  1958.  How to use ridit analysis.  Biometrics 14: 38-58.
http://www.jstor.org/stable/2527727

The name ridit actually derives by analogy from probit and logit,
and is in part "Relative to an Identified Distribution".
Neither -wridit- nor -egen, ridit()- has an option to specify the identified distribution,
as Bross intended.

However, it is possible to fudge it:
Assume I want ridits for rep78 in the auto dataset, and want to use the US subsample
as my Identified Distribution (or reference distribution).

**************************
* Begin example analysis *
**************************
sysuse auto, clear
egen ridit_rep78_usa  = ridit( rep78) if foreign == 0
bys rep78 (foreign) : replace ridit_rep78_usa  =  ridit_rep78_usa[1] if foreign[1] == 0
bys foreign : summ ridit_rep78_usa

* Contrast this with using all the data as the reference group:

egen ridit_rep78 = ridit( rep78)
bys foreign : summ ridit_rep78

* The means have shifted by a fixed amount, and the SDs have changed slightly.

**************************
*  End example analysis  *
**************************

```
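The definition of centred cumulative probabilities given in words above can be written out as a short formula. This is only a sketch: the symbols $p_j$, $r_j$ and $J$ are notation assumed here for illustration and do not come from the posts or from the help files of -wridit- or -egenmore-. Suppose the ordinal variable has categories $j = 1, \dots, J$ and $p_j$ is the proportion of the reference (identified) distribution falling in category $j$.

$$
r_j \;=\; \sum_{k<j} p_k \;+\; \tfrac{1}{2}\,p_j ,
\qquad
\sum_{j=1}^{J} p_j\, r_j \;=\; \tfrac{1}{2} .
$$

As a made-up numerical illustration, proportions $(0.2,\ 0.5,\ 0.3)$ give $r = (0.10,\ 0.45,\ 0.85)$, and the weighted mean is $0.2(0.10) + 0.5(0.45) + 0.3(0.85) = 0.5$. In the special case of $n$ distinct values, each with proportion $1/n$, the $i$-th smallest value gets

$$
r_i \;=\; \frac{i-1}{n} + \frac{1}{2n} \;=\; \frac{i - 1/2}{n} ,
$$

which is the plotting position attributed to Galton above. If the reference group is a subsample, as in the example with domestic cars, the $p_j$ would be computed from that subsample only, so the mean of $r_j$ over other groups need not be $1/2$.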