--- Kevin Daley <kevin.daley@mail.mcgill.ca> wrote:
> > I would like to use a statistic discussed by Agresti in his
> > categorical data analysis book that gives the probability that two
> > randomly selected independent observations in a given dataset will
> > end up in different categories of the given variable. The
> > statistic has a minimum value of 0 and a maximum value of J-1.

--- Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> If it is a probability then the maximum is 1. In that case you could
> compute it as follows:
>
> *---------- begin example -------------
> sysuse auto, clear
> preserve
> * one row per category of rep78, with its percentage in p
> contract rep78 , percent(p) nomiss
> * squared proportion of each category
> gen double psq = (p/100)^2
> sum psq, meanonly
> * 1 minus the sum of squared proportions
> di 1-r(sum)
> restore
> *--------- end example -----------------
> (For more on how to use examples I sent to the Statalist, see
> http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

In the case above the two draws are made with replacement, in which
case the maximum is 1-1/_N. The maximum variability is obtained when
each observation is in its own category, so there are _N categories,
each with a probability of 1/_N. The probability of drawing one
particular category twice is (1/_N)^2, and there are _N such
categories, so the probability of drawing the same category twice is
_N*(1/_N)^2 = 1/_N. The probability of not drawing the same category
twice is therefore 1-1/_N.

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------
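As a rough check of the 1-1/_N maximum described above, the sketch below reuses the same contract approach but puts every observation in its own category. The variable name id and the local macro N are introduced here only for illustration; they are not part of the original example.

```stata
*---------- begin example -------------
sysuse auto, clear
preserve
* hypothetical variable: give each observation its own category
gen long id = _n
* remember the number of observations before contract changes the data
local N = _N
* one row per category, with its percentage in p
contract id, percent(p)
gen double psq = (p/100)^2
sum psq, meanonly
* probability that two draws with replacement fall in different categories
di 1-r(sum)
* the claimed maximum, 1-1/_N
di 1-1/`N'
restore
*--------- end example ----------------
```

If the reasoning holds, the two displayed numbers should agree up to rounding error.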

