RE: st: RE: New package -wridit- on SSC


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: New package -wridit- on SSC
Date   Tue, 28 Feb 2012 17:39:53 +0000

Paul's post makes me suspect that my using the name -ridit()- for the -egen- function in -egenmore- (SSC) was a bad idea. That said, I don't feel energetic enough to change it, not least because I suspect that the original sense of Bross is no longer widely used. I do respect history, however. 

That interesting detail aside, I have used ridits, or the same beast by any other name, in this way. 
 
When plotting cumulative probabilities for an ordinal variable, defining them with the operator < gives the lowest category a probability of 0, and defining them with <= gives the highest category a probability of 1. Neither treats categories symmetrically, and both are inconvenient if you feel tempted to plot on a logit scale, as I often am, because the logits of 0 and 1 are undefined. So centred cumulative probabilities, meaning the probability of values below plus (1/2) the probability of this value, are useful graphically. The point is discussed, indeed laboured, in Section 5 of 

SJ-4-2  gr0004  .  Speaking Stata: Graphing categorical and compositional data
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/04   SJ 4(2):190--215                                 (no commands)
        discusses graphical possibilities for categorical and
        compositional data

and in the help file of -distplot- (latest public version downloadable from SJ 10-1 files). 
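
As a minimal sketch of the centred cumulative probabilities described above, the following computes them by hand for rep78 in the auto data and then takes logits. The variable names (n, N_all, p, cum_le, cum_lt, ridit, logit_ridit) are illustrative only and come from no package.

```stata
sysuse auto, clear
drop if missing(rep78)               // keep only cars with a repair record
contract rep78, freq(n)              // one observation per category, n = count
sort rep78
egen double N_all  = total(n)        // total number of nonmissing values
gen double p       = n / N_all       // probability of this value
gen double cum_le  = sum(p)          // P(value <= this): reaches 1 at the top
gen double cum_lt  = cum_le - p      // P(value <  this): 0 at the bottom
gen double ridit   = cum_lt + p / 2  // centred cumulative probability
gen double logit_ridit = logit(ridit)    // finite for every category
list rep78 n cum_lt cum_le ridit logit_ridit, noobs
```

Because cum_lt is 0 in the lowest category and cum_le is 1 in the highest, neither could be shown on a logit scale as it stands; the centred version stays strictly between 0 and 1.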

A grandparent of all this is possibly 

Galton, F. 1907. Grades and deviates. Biometrika 5: 400-406.

In a now widely used notation and terminology Galton proposed plotting positions (i - 1/2) / n. 
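
The link with centred cumulative probabilities can be checked on a toy example. With n distinct values and no ties, each value has probability 1/n, so the centred cumulative probability of the i-th smallest is (i - 1)/n + 1/(2n) = (i - 1/2)/n. A small sketch, using made-up data and illustrative variable names:

```stata
clear
set obs 5
gen x = 10 * _n                       // five distinct made-up values
sort x
gen double below  = (_n - 1) / _N     // fraction of values strictly below
gen double centre = below + 0.5 / _N  // add half the probability of this value
gen double galton = (_n - 0.5) / _N   // plotting position (i - 1/2) / n
assert reldif(centre, galton) < 1e-12 // the two definitions agree without ties
list, noobs
```

With ties the two definitions part company, which is where the centred form for grouped or ordinal data comes in.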

Nick 
n.j.cox@durham.ac.uk 


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Seed, Paul
Sent: 28 February 2012 16:41
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: New package -wridit- on SSC

Thanks as ever to Roger & Nick for their work on this.
I went back to the original article by Bross to try to understand what ridits were:
Bross, I. D. J.  1958.  How to use ridit analysis.  Biometrics 14: 38-58.
http://www.jstor.org/stable/2527727

The name ridit actually derives by analogy from probit and logit, 
and is in part "Relative to an Identified Distribution".
Neither -wridit- nor -egen, ridit()- has an option to specify the identified distribution,
as Bross intended.

However, it is possible to fudge it:
Assume I want ridits for rep78 in the auto dataset, and want to use the US subsample 
as my Identified Distribution (or reference distribution).

**************************
* Begin example analysis *
**************************
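sysuse auto, clear
* Ridits of rep78 computed among US cars only (-egen, ridit()- needs -egenmore- from SSC)
egen ridit_rep78_usa  = ridit( rep78) if foreign == 0
* Copy each category's US ridit to the foreign cars in the same category
* (only where that category contains at least one US car)
bys rep78 (foreign) : replace ridit_rep78_usa  =  ridit_rep78_usa[1] if foreign[1] == 0
bys foreign : summ ridit_rep78_usa 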

* Contrast this with using all the data as the reference group:

egen ridit_rep78 = ridit( rep78) 
bys foreign : summ ridit_rep78

* The means have shifted by a fixed amount, and the SDs have changed slightly.

**************************
*  End example analysis  *
**************************

> Roger B. Newson
>
> Thanks as always to Kit Baum, a new package -wridit- is now available
> for download from SSC. In Stata, use the -ssc- command to do this.
>
> The -wridit- package is described as below on my website, and calculates
> weighted ridits for a variable. Zero weights are allowed, in which case
> the ridits for the observations with zero weights are relative to the
> weight distribution in the observations with non-zero weights. Ridits,
> and the left, right and central inverse ridits, are important in rank
> statistics, which, strictly speaking, are really ridit statistics. They
> are also potentially useful in spline statistics, where the user might
> want to define a spline in the ridit of an X-variable, instead of in the
> X-variable itself.
>
> I would like to thank Nick Cox for writing the -ridit()- function of the
> -egenmore- package, which generates unweighted ridits, and from which I
> borrowed a few ideas for -wridit-. I slightly revised the algorithm for
> -wridit-, in order to avoid the small numerical accuracy issues
> associated with adding a small probability to a large probability.
>
> ---------------------------------------------------------------------------
> package wridit from http://www.imperial.ac.uk/nhli/r.newson/stata10
> ---------------------------------------------------------------------------
>
> TITLE
>         wridit: Generate weighted ridits
>
> DESCRIPTION/AUTHOR(S)
>         wridit inputs a variable and generates its weighted ridits.
>         If no weights are provided, then all weights are assumed
>         equal to 1, so unweighted ridits are generated.
>
>         Author: Roger Newson
>         Distribution-Date: 22february2012
>         Stata-Version: 10
>
> INSTALLATION FILES                                  (click here to install)
>         wridit.ado
>         wridit.sthlp
> ---------------------------------------------------------------------------
> (click here to return to the previous screen)
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


