Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <n.j.cox@durham.ac.uk> |
To | "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: Herfindahl, segregation index |
Date | Wed, 26 Jan 2011 17:38:06 +0000 |
Here's a dopey example. I'm going to treat -rep78- in the auto data as a categorical variable and calculate Simpson (= Gini) diversity, which I define just as the sum of squared proportions. To add a bit of a challenge, I'll do that separately by -foreign-.

. sysuse auto, clear
(1978 Automobile Data)

I am fond of using -contract- to reduce to a dataset of frequencies, rather than -collapse-, but -collapse- would do the job too.

. contract foreign rep78, nomiss

. l

     +--------------------------+
     | rep78    foreign   _freq |
     |--------------------------|
  1. |     1   Domestic       2 |
  2. |     2   Domestic       8 |
  3. |     3   Domestic      27 |
  4. |     4   Domestic       9 |
  5. |     5   Domestic       2 |
     |--------------------------|
  6. |     3    Foreign       3 |
  7. |     4    Foreign       9 |
  8. |     5    Foreign       9 |
     +--------------------------+

. ineq rep78, by(foreign) gensim(simpson)

----------------------------------------------------------
 Car type |   freq    Simpson    entropy    dissim.
----------+-----------------------------------------------
 Domestic |      5      0.244      1.490      0.200
  Foreign |      3      0.347      1.078      0.083
----------------------------------------------------------

. l

     +-------------------------------------+
     | rep78    foreign   _freq    simpson |
     |-------------------------------------|
  1. |     1   Domestic       2   .2444445 |
  2. |     2   Domestic       8   .2444445 |
  3. |     3   Domestic      27   .2444445 |
  4. |     4   Domestic       9   .2444445 |
  5. |     5   Domestic       2   .2444445 |
     |-------------------------------------|
  6. |     3    Foreign       3   .3472222 |
  7. |     4    Foreign       9   .3472222 |
  8. |     5    Foreign       9   .3472222 |
     +-------------------------------------+

. collapse simpson, by(foreign)

. l

     +---------------------+
     |  foreign    simpson |
     |---------------------|
  1. | Domestic   .2444445 |
  2. |  Foreign   .3472222 |
     +---------------------+

You must install -ineq- from SSC first.

Nick
n.j.cox@durham.ac.uk

Nick Cox

-ineq- (SSC) will work on what are here called unit-record data. You just need to -contract- first.
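As a cross-check on the definition given earlier (sum of squared proportions), here is a minimal sketch that works from the -contract-ed frequencies using only official commands. The variable names n_total, p and simpson_freq are made up for illustration and are not part of -ineq-.

```stata
sysuse auto, clear
contract foreign rep78, nomiss

* total frequency within each level of foreign
bysort foreign: egen double n_total = total(_freq)

* proportion of each rep78 category within foreign
generate double p = _freq / n_total

* Simpson diversity as defined above: sum of squared proportions
bysort foreign: egen double simpson_freq = total(p^2)

list foreign rep78 _freq simpson_freq, sepby(foreign)
```

From the frequencies listed above, that works out to 882/2304, about 0.383, for Domestic and 171/441, about 0.388, for Foreign. The 0.244 and 0.347 in the -ineq- listing equal 55/225 and 50/144, which is the same formula applied to the rep78 values 1 to 5 and 3 to 5 themselves, so it looks as if the call shown summarised the category codes rather than _freq.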
Nick
n.j.cox@durham.ac.uk

Austin Nichols

Tomeka Davis <soctmd@langate.gsu.edu>:
I had a look at -seg- and it does not seem to support weights, and it does not operate on unit record data, so you would have to -collapse- or otherwise modify an individual-level dataset (probably using weights) to prepare it for -seg-. -seg- seems to be designed mostly for use on US Census tract- or block-level data. If you had tract-level data with shares already defined as variables, the HHI would be computed with a single call to -generate- e.g.

. gen hhi=white^2+black^2+other^2

so I assume you don't have that simple situation. Here is an example that demonstrates the closest parallel of the output of -seg- to HHI:

webuse nhanes2, clear
* pretend data is unweighted
ta race
qui levelsof race, loc(vs)
qui foreach v of loc vs {
 egen sh`v'=mean(race==`v'), by(region smsa)
 replace sh`v'=sh`v'^2
 la var sh`v' "sq. share race==`v'"
}
su sh*
egen hhi=rowtotal(sh*)
bys region smsa:g two=(_n>1)
li region smsa sh* hhi if two==0, noo sepby(region)
* stop pretending data is unweighted
egen gp=group(region smsa)
qui levelsof gp, loc(gs)
qui foreach v of loc vs {
 tempvar vi
 g `vi'=race==`v'
 g ws`v'=.
 la var ws`v' "wtd. sq. share race==`v'"
 foreach g of loc gs {
  su `vi' if gp==`g' [aw=finalwgt], mean
  replace ws`v'=r(mean)^2 if gp==`g'
 }
}
egen whhi=rowtotal(ws*)
li region smsa hhi whhi if two==0, noo sepby(region)
g white=race==1
collapse white black orace hhi whhi ws? sh? [pw=finalwgt], by(region smsa gp)
g norm=(1-whhi)*3/2
qui seg white black orace, by(gp) gen(i indx) p
li region smsa whhi norm indx, noo sepby(region)

Note that if you need to use -seg- on unit-record data, you will first collapse, then run -seg-, then save under a new name, then go back to your original data and merge on the output.
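The collapse, save, merge round trip described in that last note can be sketched as follows. This is only an illustration under some assumptions: it computes the weighted HHI directly where the -seg- call would go (so nothing about -seg- syntax is assumed), it uses a -tempfile- in place of "a new name", and the names groupstats, n, total, sqshare and whhi2 are hypothetical.

```stata
webuse nhanes2, clear
tempfile groupstats

preserve
    * one row per region-smsa-race cell holding the weighted count
    collapse (sum) n = finalwgt, by(region smsa race)

    * weighted share of each race within its region-smsa group, squared
    bysort region smsa: egen double total = total(n)
    generate double sqshare = (n / total)^2

    * one row per group: sum of squared shares (this is where -seg- would run)
    collapse (sum) whhi2 = sqshare, by(region smsa)
    save `groupstats'
restore

* back in the unit-record data: attach the group-level result to every person
merge m:1 region smsa using `groupstats', nogenerate
```

-preserve- and -restore- take care of "go back to your original data"; saving to a permanent file and reloading the original with -use- would do the same job. -merge m:1- needs Stata 11 or later.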
On Wed, Jan 26, 2011 at 10:44 AM, Austin Nichols <austinnichols@gmail.com> wrote:
> Tomeka Davis <soctmd@langate.gsu.edu> :
> If you want the HHI, calculate the sum of squared shares directly,
> perhaps using -egen- or -by- a couple of times, but if you want to use
> the user-written -seg- on SSC you should check out its references,
> particularly the 2002 paper by the same author:
>
> James, David R. and Karl E. Taeuber. 1985. "Measures of segregation."
> Sociological Methodology 14:1-32.
> Massey, Douglas S. and Nancy A. Denton. 1988. "The dimensions of racial
> segregation." Social Forces 67:281-315.
> Reardon, Sean F. and Glenn Firebaugh. 2002. "Measures of multigroup
> segregation." Sociological Methodology 32:33-67.
> White, Michael J. 1986. "Segregation and diversity measures in population
> distribution." Population Index 52:198-221.
> Zoloth, Barbara S. 1976. "Alternative measures of school segregation." Land
> Economics 52:278-298.
>
> On Wed, Jan 26, 2011 at 9:23 AM, Tomeka Davis <soctmd@langate.gsu.edu> wrote:
>> Hello -
>>
>> I would like to compute a racial segregation index for a set of data. I know -seg- will allow me to do this, but I am not clear on which of the indices computed by -seg- is similar to the Herfindahl. I would appreciate any advice.
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/