



Re: st: Herfindahl, segregation index


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Herfindahl, segregation index
Date   Wed, 26 Jan 2011 12:53:24 -0500

Nick--
Are you using the frequency weights in your example? Also note that
-contract- does not allow pweights, as used on survey data, or aweights,
as used on summary data. My example calculates HHI on the original data;
the -collapse- near the end of the code is only to show the connection to
the user-written program the poster asked about. If one intends
to use group- or region-level HHI or the like in a regression, say,
it is much more efficient to compute it in place than to collapse the
data and then merge the result back on.
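
The question about frequencies can be checked by hand from the quoted
output below. The 0.244 reported for Domestic equals
(1^2+2^2+3^2+4^2+5^2)/15^2 = 55/225, and the 0.347 for Foreign equals
(3^2+4^2+5^2)/12^2 = 50/144, which suggests the squared shares were taken
over the rep78 codes themselves rather than over the _freq counts. Using
the listed counts instead gives 882/2304 (about 0.383) for Domestic and
171/441 (about 0.388) for Foreign. What follows is only a minimal sketch
of computing the index in place, without -contract- or -collapse-, using
official commands; the variable names n_grp, sqshare, hhi, w_grp, w_cat,
wsq, and whhi2 are made up for illustration, and the weighted part assumes
race and finalwgt have no missing values.

```stata
* Unweighted: sum of squared category shares of rep78 within foreign
sysuse auto, clear
* nonmissing rep78 observations per group
egen double n_grp = total(!missing(rep78)), by(foreign)
* squared share of each category, stored once per category
bysort foreign rep78: gen double sqshare = (_N/n_grp)^2 if _n == 1 & !missing(rep78)
* total over categories, spread to every observation in the group
egen double hhi = total(sqshare), by(foreign)
tabstat hhi, by(foreign) statistics(mean) format(%9.4f)

* Weighted: the same idea with weight totals in place of counts
webuse nhanes2, clear
egen double w_grp = total(finalwgt), by(region smsa)
egen double w_cat = total(finalwgt), by(region smsa race)
bysort region smsa race: gen double wsq = (w_cat/w_grp)^2 if _n == 1
egen double whhi2 = total(wsq), by(region smsa)
```

Because the result is already attached to every observation, it can go
straight into a regression with no merge step.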

On Wed, Jan 26, 2011 at 12:38 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Here's a dopey example. I'm going to treat -rep78- in the auto data as a categorical variable and calculate Simpson (= Gini) diversity, which I define just as the sum of squared proportions. To add a bit of a challenge, I'll do that separately by -foreign-.
>
> . sysuse auto, clear
> (1978 Automobile Data)
>
> I am fond of using -contract- to reduce to a dataset of frequencies, rather than -collapse-, but -collapse- would do the job too.
>
> . contract foreign rep78, nomiss
>
> . l
>
>     +--------------------------+
>     | rep78    foreign   _freq |
>     |--------------------------|
>  1. |     1   Domestic       2 |
>  2. |     2   Domestic       8 |
>  3. |     3   Domestic      27 |
>  4. |     4   Domestic       9 |
>  5. |     5   Domestic       2 |
>     |--------------------------|
>  6. |     3    Foreign       3 |
>  7. |     4    Foreign       9 |
>  8. |     5    Foreign       9 |
>     +--------------------------+
>
> . ineq rep78, by(foreign) gensim(simpson)
>
> ----------------------------------------------------------
>  Car type |       freq     Simpson     entropy     dissim.
> ----------+-----------------------------------------------
>  Domestic |          5       0.244       1.490       0.200
>   Foreign |          3       0.347       1.078       0.083
> ----------------------------------------------------------
>
> . l
>
>     +-------------------------------------+
>     | rep78    foreign   _freq    simpson |
>     |-------------------------------------|
>  1. |     1   Domestic       2   .2444445 |
>  2. |     2   Domestic       8   .2444445 |
>  3. |     3   Domestic      27   .2444445 |
>  4. |     4   Domestic       9   .2444445 |
>  5. |     5   Domestic       2   .2444445 |
>     |-------------------------------------|
>  6. |     3    Foreign       3   .3472222 |
>  7. |     4    Foreign       9   .3472222 |
>  8. |     5    Foreign       9   .3472222 |
>     +-------------------------------------+
>
> . collapse simpson, by(foreign)
>
> . l
>
>     +---------------------+
>     |  foreign    simpson |
>     |---------------------|
>  1. | Domestic   .2444445 |
>  2. |  Foreign   .3472222 |
>     +---------------------+
>
> You must install -ineq- from SSC first.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Nick Cox
>
> -ineq- (SSC) will work on what are here called unit-record data.
>
> You just need to -contract- first.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Austin Nichols
>
> Tomeka Davis <soctmd@langate.gsu.edu>:
> I had a look at -seg- and it does not seem to support weights, and it
> does not operate on unit record data, so you would have to -collapse-
> or otherwise modify an individual-level dataset (probably using
> weights) to prepare it for -seg-. -seg- seems to be designed mostly
> for use on US Census tract- or block-level data. If you had
> tract-level data with shares already defined as variables, the HHI
> would be computed with a single call to -generate- e.g.
> . gen hhi=white^2+black^2+other^2
> so I assume you don't have that simple situation. Here is an example
> that demonstrates the closest parallel of the output of -seg- to HHI:
>
> webuse nhanes2, clear
> * pretend data is unweighted
> ta race
> qui levelsof race, loc(vs)
> qui foreach v of loc vs {
>     egen sh`v'=mean(race==`v'), by(region smsa)
>     replace sh`v'=sh`v'^2
>     la var sh`v' "sq. share race==`v'"
> }
> su sh*
> egen hhi=rowtotal(sh*)
> bys region smsa:g two=(_n>1)
> li region smsa sh* hhi if two==0, noo sepby(region)
> * stop pretending data is unweighted
> egen gp=group(region smsa)
> qui levelsof gp, loc(gs)
> qui foreach v of loc vs {
>     tempvar vi
>     g `vi'=race==`v'
>     g ws`v'=.
>     la var ws`v' "wtd. sq. share race==`v'"
>     foreach g of loc gs {
>         su `vi' if gp==`g' [aw=finalwgt], mean
>         replace ws`v'=r(mean)^2 if gp==`g'
>     }
> }
> egen whhi=rowtotal(ws*)
> li region smsa hhi whhi if two==0, noo sepby(region)
> g white=race==1
> collapse white black orace hhi whhi ws? sh? [pw=finalwgt], by(region smsa gp)
> g norm=(1-whhi)*3/2
> qui seg white black orace, by(gp) gen(i indx) p
> li region smsa whhi norm indx, noo sepby(region)
>
> Note that if you need to use -seg- on unit-record data, you will first
> need to -collapse-, then run -seg-, then save the results under a new
> name, and finally go back to your original data and -merge- on the output.
>
> On Wed, Jan 26, 2011 at 10:44 AM, Austin Nichols
> <austinnichols@gmail.com> wrote:
>> Tomeka Davis <soctmd@langate.gsu.edu> :
>> If you want the HHI, calculate the sum of squared shares directly,
>> perhaps using -egen- or -by- a couple of times, but if you want to use
>> the user-written -seg- on SSC you should check out its references,
>> particularly the 2002 paper coauthored by the author of -seg-:
>>
>> James, David R. and Karl E. Taeuber. 1985. "Measures of segregation."
>>      Sociological Methodology 14:1-32.
>> Massey, Douglas S. and Nancy A. Denton. 1988. "The dimensions of racial
>>      segregation." Social Forces 67:281-315.
>> Reardon, Sean F. and Glenn Firebaugh. 2002. "Measures of multigroup
>>      segregation." Sociological Methodology 32:33-67.
>> White, Michael J. 1986. "Segregation and diversity measures in population
>>      distribution." Population Index 52:198-221.
>> Zoloth, Barbara S. 1976. "Alternative measures of school segregation." Land
>>      Economics 52:278-298.
>>
>> On Wed, Jan 26, 2011 at 9:23 AM, Tomeka Davis <soctmd@langate.gsu.edu> wrote:
>>> Hello -
>>>
>>> I would like to compute a racial segregation index for a set of data.  I know -seg- will allow me to do this, but I am not clear on which of the indices computed by -seg- is similar to the Herfindahl.  I would appreciate any advice.
