Statalist



RE: st: Measure of Variability in a Nominal Variable


From   Maarten buis <[email protected]>
To   [email protected]
Subject   RE: st: Measure of Variability in a Nominal Variable
Date   Tue, 4 Mar 2008 20:53:39 +0000 (GMT)

An alternative that would fit the description given by Kevin is:
Agresti (1996) An Introduction to Categorical Data Analysis. Hoboken
NJ: John Wiley. 

Also, the reference given by Nick is the second edition, which is much
expanded from the first edition.

-- Maarten

--- Nick Cox <[email protected]> wrote:

> A reference, as requested by Steven Samuels in his question to Kevin
> Daley, is 
> 
> Agresti, A. 2002. Categorical data analysis. Hoboken NJ: John Wiley. 
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 04 March 2008 18:04
> To: [email protected]
> Subject: RE: st: Measure of Variability in a Nominal Variable
> 
> If p_i is the proportion in category i, 
> then SUM p_i^2 is the probability that two randomly drawn
> observations are in the same category. 
> (The sum is over categories, not observations.) 
> 
> The complement 1 - SUM p_i^2 is 
> then the probability that two randomly drawn observations are in
> different categories. 
> 
> The reciprocal 1 / SUM p_i^2 has a nice interpretation as the
> equivalent number of equally probable categories. 
> 
> One or more of these quantities arise under many different names: 
> 
> 	Gini index (but NB that many other measures have also been called that) 
> 
> 	Simpson index in ecology (the same Simpson as Simpson's paradox)
> 
> 
> 	Herfindahl index in economics 
> 
> 	heterozygosity in genetics
> 
> And no doubt others. 
> 
> Maarten gave one way to calculate it. Another is through -ineq- on
> SSC. 
> 
> Nick 
> n.j.cox
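
As a minimal sketch, the three quantities Nick describes could be
computed in one pass with -contract-. This assumes the auto dataset
shipped with Stata and uses rep78 purely as an illustration; the scalar
name psum is my own choice, not something from the thread.

```stata
*---------- begin example -------------
sysuse auto, clear
preserve
* one row per category of rep78, with p = percentage in that category
contract rep78, percent(p) nomiss
gen double psq = (p/100)^2
* -summarize, meanonly- leaves the sum of psq in r(sum)
summarize psq, meanonly
scalar psum = r(sum)
restore
* scalars survive -restore-, so they can be displayed afterwards
display "P(same category)      = " scalar(psum)
display "P(different category) = " 1 - scalar(psum)
display "equivalent categories = " 1/scalar(psum)
*--------- end example -----------------
```

The reciprocal is the number of equally probable categories that would
give the same value of SUM p_i^2, so it can never exceed the number of
categories actually observed.
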
> 
> Maarten buis
> 
> > --- Kevin Daley <[email protected]> wrote:
> > > I would like to use a statistic discussed by Agresti in his
> > > categorical data analysis book that gives the probability that two
> > > randomly selected independent observations in a given dataset will
> > > end up in different categories of the given variable.  The
> > > statistic has a minimum value of 0 and a maximum value of J-1.  
> 
> --- Maarten buis <[email protected]> wrote:
> > If it is a probability then the maximum is 1. In that case you could
> > compute it as follows:
> > 
> > *---------- begin example -------------
> > sysuse auto, clear
> > preserve
> > contract rep78 , percent(p) nomiss
> > gen double psq = (p/100)^2
> > sum psq, meanonly
> > di 1-r(sum)
> > restore
> > *--------- end example -----------------
> > (For more on how to use examples I sent to the Statalist, see
> > http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
> 
> In the case above the two draws are draws with replacement, in which
> case the maximum is 1-1/_N. The maximum variability is obtained when
> each observation is in its own category, so there are _N categories
> each with a probability of 1/_N. The probability of drawing one
> particular category twice is (1/_N)^2, and there are _N such
> categories, so the probability of drawing the same category twice is
> _N*(1/_N)^2 = 1/_N. The probability of not drawing the same category
> twice is 1-1/_N.
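
The 1-1/_N bound can be checked numerically. The sketch below assumes
the auto dataset again and creates a hypothetical variable id that puts
every observation in its own category; the computed value and 1-1/_N
should then agree up to rounding.

```stata
*---------- begin example -------------
sysuse auto, clear
preserve
* remember the number of observations before the data are contracted
local n = _N
* id is an illustrative variable: one distinct category per observation
gen long id = _n
contract id, percent(p)
gen double psq = (p/100)^2
summarize psq, meanonly
display "1 - SUM p_i^2 = " 1 - r(sum)
display "1 - 1/_N      = " 1 - 1/`n'
restore
*--------- end example -----------------
```
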
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

