RE: st: Herfindahl, segregation index

From	Nick Cox <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	RE: st: Herfindahl, segregation index
Date	Wed, 26 Jan 2011 17:53:33 +0000

Sorry; in my excitement I processed the wrong variable (-rep78- rather than -_freq-). A corrected script is 

sysuse auto, clear
contract foreign rep78, nomiss
ineq _freq, by(foreign) gensim(simpson)
collapse simpson, by(foreign)
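
As a cross-check that does not depend on -ineq-, the sum of squared proportions can also be computed by hand. This is only a sketch: the variable names -tot- and -p2- are made up for illustration, and it is not a description of how -ineq- works internally. With the frequencies that -contract- produces for these data (Domestic 2, 8, 27, 9, 2; Foreign 3, 9, 9), the arithmetic is 882/2304 for Domestic and 171/441 for Foreign.

```stata
sysuse auto, clear
contract foreign rep78, nomiss
* total frequency within each level of -foreign- (tot is an illustrative name)
bysort foreign: egen double tot = total(_freq)
* squared proportion of each -rep78- category within its group
gen double p2 = (_freq / tot)^2
* Simpson as defined in this thread: the sum of squared proportions
collapse (sum) simpson = p2, by(foreign)
list
```

If the corrected script and this hand calculation disagree, that would suggest one of them is again being fed the wrong variable.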

Nick 

Nick Cox

Here's a dopey example. I'm going to treat -rep78- in the auto data as a categorical variable and calculate Simpson (= Gini) diversity, which I define just as the sum of squared proportions. To add a bit of a challenge, I'll do that separately by -foreign-. 

. sysuse auto, clear
(1978 Automobile Data)

I am fond of using -contract- to reduce to a dataset of frequencies, rather than -collapse-, but -collapse- would do the job too. 
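
For comparison, here is a sketch of the -collapse- route to the same dataset of frequencies. It assumes that counting nonmissing values of -price- is an acceptable way to count observations (-price- is never missing in the auto data), and it names the result -_freq- only to match what -contract- produces.

```stata
sysuse auto, clear
* count observations in each foreign-by-rep78 cell, skipping missing -rep78-
collapse (count) _freq = price if !missing(rep78), by(foreign rep78)
list, sepby(foreign)
```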

. contract foreign rep78, nomiss

. l

     +--------------------------+
     | rep78    foreign   _freq |
     |--------------------------|
  1. |     1   Domestic       2 |
  2. |     2   Domestic       8 |
  3. |     3   Domestic      27 |
  4. |     4   Domestic       9 |
  5. |     5   Domestic       2 |
     |--------------------------|
  6. |     3    Foreign       3 |
  7. |     4    Foreign       9 |
  8. |     5    Foreign       9 |
     +--------------------------+

. ineq rep78, by(foreign) gensim(simpson)

----------------------------------------------------------
 Car type |       freq     Simpson     entropy     dissim.
----------+-----------------------------------------------
 Domestic |          5       0.244       1.490       0.200
  Foreign |          3       0.347       1.078       0.083
----------------------------------------------------------

. l

     +-------------------------------------+
     | rep78    foreign   _freq    simpson |
     |-------------------------------------|
  1. |     1   Domestic       2   .2444445 |
  2. |     2   Domestic       8   .2444445 |
  3. |     3   Domestic      27   .2444445 |
  4. |     4   Domestic       9   .2444445 |
  5. |     5   Domestic       2   .2444445 |
     |-------------------------------------|
  6. |     3    Foreign       3   .3472222 |
  7. |     4    Foreign       9   .3472222 |
  8. |     5    Foreign       9   .3472222 |
     +-------------------------------------+

. collapse simpson, by(foreign)

. l

     +---------------------+
     |  foreign    simpson |
     |---------------------|
  1. | Domestic   .2444445 |
  2. |  Foreign   .3472222 |
     +---------------------+

You must install -ineq- from SSC first (-ssc install ineq-). 

Nick 

Nick Cox

-ineq- (SSC) will work on what are here called unit-record data. 

You just need to -contract- first. 

Nick 

Austin Nichols

Tomeka Davis <[email protected]>:
I had a look at -seg- and it does not seem to support weights, and it
does not operate on unit-record data, so you would have to -collapse-
or otherwise modify an individual-level dataset (probably using
weights) to prepare it for -seg-. -seg- seems to be designed mostly
for use on US Census tract- or block-level data. If you had
tract-level data with shares already defined as variables, the HHI
would be computed with a single call to -generate- e.g.
. gen hhi=white^2+black^2+other^2
so I assume you don't have that simple situation. Here is an example
that demonstrates the closest parallel of the output of -seg- to HHI:

webuse nhanes2, clear
* pretend data is unweighted
ta race
* vs holds the distinct values of race
qui levelsof race, loc(vs)
* squared share of each race within each region-by-smsa cell
qui foreach v of loc vs {
 egen sh`v'=mean(race==`v'), by(region smsa)
 replace sh`v'=sh`v'^2
 la var sh`v' "sq. share race==`v'"
 }
su sh*
* unweighted HHI = sum of squared shares
egen hhi=rowtotal(sh*)
* two==0 marks the first observation in each region-smsa cell
bys region smsa:g two=(_n>1)
li region smsa sh* hhi if two==0, noo sepby(region)
* stop pretending data is unweighted
egen gp=group(region smsa)
qui levelsof gp, loc(gs)
* weighted squared share of each race within each cell
qui foreach v of loc vs {
 tempvar vi
 g `vi'=race==`v'
 g ws`v'=.
 la var ws`v' "wtd. sq. share race==`v'"
 foreach g of loc gs {
  su `vi' if gp==`g' [aw=finalwgt], mean
  replace ws`v'=r(mean)^2 if gp==`g'
  }
 }
* weighted HHI = sum of weighted squared shares
egen whhi=rowtotal(ws*)
li region smsa hhi whhi if two==0, noo sepby(region)
g white=race==1
* reduce to one observation per region-smsa cell
collapse white black orace hhi whhi ws? sh? [pw=finalwgt], by(region smsa gp)
* with 3 groups 1-HHI is at most 2/3, so this rescales it to [0,1]
g norm=(1-whhi)*3/2
qui seg white black orace, by(gp) gen(i indx) p
li region smsa whhi norm indx, noo sepby(region)

Note that if you need to use -seg- on unit-record data, you will first
need to -collapse-, then run -seg-, then save the results under a new
name, then go back to your original data and -merge- the output on.
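
That collapse/run/save/merge sequence might be sketched as follows. This is an illustration under assumptions, not a tested recipe: it reuses the -seg- call from the example in this message unchanged (so -seg- must be installed from SSC and must leave -indx- in the data as it does there), it saves to a tempfile (the macro name -segout- is made up) rather than to a named file, and it uses the -merge m:1- syntax of Stata 11 and later.

```stata
webuse nhanes2, clear
egen gp = group(region smsa)
g white = race==1
* keep a copy of the unit-record data while working on the collapsed version
preserve
collapse white black orace [pw=finalwgt], by(gp)
* same call as in the example in this message; requires -seg- from SSC
qui seg white black orace, by(gp) gen(i indx) p
keep gp indx
* segout is an illustrative name for a temporary file
tempfile segout
save `segout'
* back to the unit-record data, then attach the index to every observation
restore
merge m:1 gp using `segout', nogen
```

Using -preserve- and -restore- saves reloading the original file by hand; saving the collapsed results under a permanent name works just as well if they are needed later.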

On Wed, Jan 26, 2011 at 10:44 AM, Austin Nichols
<[email protected]> wrote:
> Tomeka Davis <[email protected]> :
> If you want the HHI, calculate the sum of squared shares directly,
> perhaps using -egen- or -by- a couple of times, but if you want to use
> the user-written -seg- on SSC you should check out its references,
> particularly the 2002 paper by the same author:
>
> James, David R. and Karl E. Taeuber. 1985. "Measures of segregation."
>      Sociological Methodology 14:1-32
> Massey, Douglas S. and Nancy A. Denton. 1988. "The dimensions of racial
>      segregation." Social Forces 67:281-315.
> Reardon, Sean F., and Glenn Firebaugh. 2002. "Measures of multigroup
>      segregation."  Sociological Methodology 32: 33-67.
> White, Michael J. 1986. "Segregation and diversity measures in population
>      distribution." Population Index 52:198-221.
> Zoloth, Barbara S. 1976. "Alternative measures of school segregation." Land
>      Economics 52:278-298.
>
> On Wed, Jan 26, 2011 at 9:23 AM, Tomeka Davis <[email protected]> wrote:
>> Hello -
>>
>> I would like to compute a racial segregation index for a set of data.  I know -seg- will allow me to do this, but I am not clear on which of the indices computed by -seg- is similar to the Herfindahl.  I would appreciate any advice.
