
Re: st: Measure of Variability in a Nominal Variable

From   Maarten buis <>
Subject   Re: st: Measure of Variability in a Nominal Variable
Date   Mon, 3 Mar 2008 22:52:57 +0000 (GMT)

> --- Kevin Daley <> wrote:
> > I would like to use a statistic discussed by Agresti in his
> > categorical data analysis book that gives the probability that two
> > randomly selected independent observations in a given dataset will
> > end up in different categories of the given variable.  The
> > statistic has a minimum value of 0 and a maximum value of J-1.  

--- Maarten buis <> wrote:
> If it is a probability then the maximum is 1. In that case you could
> compute it as follows:
> *---------- begin example -------------
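> sysuse auto, clear
> preserve
> * collapse to one row per category of rep78, with its percentage in p
> contract rep78 , percent(p) nomiss
> * squared proportion of each category
> gen double psq = (p/100)^2
> * r(sum) is the probability that both draws land in the same category
> sum psq, meanonly
> di 1-r(sum)
> restore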
> *--------- end example -----------------
> (For more on how to use examples I sent to the Statalist, see
> )

In the example above the two draws are made with replacement, in which
case the maximum is 1-1/_N. The maximum variability is obtained when
each observation is in its own category, so there are _N categories,
each with a probability of 1/_N. The probability of drawing any one
particular category twice is (1/_N)^2, and there are _N such
categories, so the probability of drawing the same category twice is
_N*(1/_N)^2 = 1/_N. The probability of not drawing the same category
twice is therefore 1-1/_N.
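
The 1-1/_N bound can be checked numerically with a small variation on
the earlier example. This is only a sketch: the variable id is a
made-up identifier (not part of the auto data) that puts every
observation in its own category, which is the maximum-variability
case described above. The two displayed numbers should agree up to
rounding.

```stata
*---------- begin example -------------
sysuse auto, clear
preserve
* hypothetical identifier: every observation becomes its own category
gen long id = _n
* one row per category, with its percentage stored in p
contract id, percent(p)
gen double psq = (p/100)^2
sum psq, meanonly
* probability that two draws with replacement differ
di 1-r(sum)
* theoretical maximum; after -contract- _N equals the number of categories
di 1-1/_N
restore
*--------- end example -----------------
```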

-- Maarten

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715
