# st: RV: RE: RV: RE: Sum over all possible combinations

From: "Julio Estevez"
To: statalist@hsphsun2.harvard.edu
Subject: st: RV: RE: RV: RE: Sum over all possible combinations
Date: Mon, 23 Jul 2007 21:17:42 -0500

Following Nick's suggestion, let me explain the _real_ problem.

What I need to do is construct a polarization measure, in particular the one presented by Esteban and Ray (Econometrica, 1994):

Suppose yi represents the average income of the typical individual belonging to class i. The population is divided into n groups.

I need to construct an indicator that sums over all i and j the following term:

wij*abs(yi - yj)

wij, the weighting scheme, has the following form:

wij = (si^d)*(sj)

where si is the population weight of group i, sj is the population weight of group j, and d is a constant.
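Putting the pieces above together, the indicator as described in this message can be written in one line. The symbol P is only a label introduced here for convenience; everything else follows the notation above.

$$
P = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\,\lvert y_i - y_j \rvert,
\qquad
w_{ij} = s_i^{d}\, s_j .
$$

The terms with i = j are zero, and because w_ij generally differs from w_ji, the two orderings of each pair contribute with different weights.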

I hope this helps and that somebody can help me find a solution.

Julio

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
Sent: Monday, 23 July 2007 01:12 p.m.
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RV: RE: Sum over all possible combinations

That can happen if you don't tell us the _real_ problem.
And you haven't yet, as "a somewhat complex weighting scheme"
could mean various things.

But yes, Mata would help a lot. You just need to loop doubly over your data.
Sorting your data and looping only over non-negative differences and then
doubling might help.
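
As a rough illustration of the double loop suggested above, here is a minimal Mata sketch using the weights wij = (si^d)*(sj) from Julio's message. It assumes the data have already been reduced to one observation per group. The variable names `y` and `s`, the function name `erpolar()`, and the value d = 1.5 are placeholders chosen for illustration, not anything taken from the thread. Because these weights are not symmetric in i and j, simply doubling the sum over i < j would not be right; the sketch instead adds both orderings of each pair in one step.

```stata
* Sketch only: y (group mean income) and s (group population share)
* are hypothetical variable names, one observation per group.
mata:
real scalar erpolar(real colvector y, real colvector s, real scalar d)
{
    real scalar i, j, n, acc

    n = rows(y)
    acc = 0
    for (i = 1; i <= n; i++) {
        // terms with j == i are zero, so start at j = i + 1
        for (j = i + 1; j <= n; j++) {
            // add the (i,j) and (j,i) terms together
            acc = acc + (s[i]^d * s[j] + s[j]^d * s[i]) * abs(y[i] - y[j])
        }
    }
    return(acc)
}

// d = 1.5 is an arbitrary illustrative value
erpolar(st_data(., "y"), st_data(., "s"), 1.5)
end
```

Only the two input vectors are held in memory, so no dataset of all pairwise combinations is ever created.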

Nick
n.j.cox@durham.ac.uk

J E

Thanks to Nick and to Sebastian for their initial responses to my query. I am afraid, however, that neither response fully solves the problem I need to tackle.

Nick's comment (that the sum of absolute differences over all possible combinations is related to the Gini coefficient and to L-moment indicators) is true. However, since I need to reweight each absolute difference by a somewhat complex weighting scheme (I did not post the weights to avoid adding complexity to my query), the shortcut suggested by Nick would not work in my case.

The strategy proposed by Sebastian does work:

```
bys group: gen g_i=_n
tempfile join_file
preserve
ren y_i y_k
save `join_file', replace
restore
joinby group using `join_file', unmatched(none)
bys group: egen sum_dy = sum(abs(y_i - y_k))
bys group g_i: keep if _n==1
```

However, it only works for relatively small data sets. The strategy very quickly eats all my memory, since it needs to hold in memory a dataset containing all possible combinations.

Given this, let me ask whether somebody knows an alternative to Sebastian's strategy that does not require this intermediate step. Alternatively, would the use of Mata allow me to get around this problem?
