
From: "Michael Blasnik" <[email protected]>

To: <[email protected]>

Subject: st: Re: Finding and tagging overlapping groups

Date: Fri, 29 Jul 2005 19:39:31 -0400

I have had need for similar twisted data management puzzles. The program below might do the trick. It takes the first not-yet-assigned value of the first variable and gives its observations a group #. It then copies that group # to all observations that share a value of the second variable with them, then to all observations that share a value of the first variable with those, and keeps going back and forth until a pass picks up no new observations. It then flags those cases as done so they move to the back of the sort order for the next round through the loop. The group number gets incremented and it keeps looping until every observation has been assigned a group #. I think it should work, but you should definitely test it.

    program define formgrp
        version 9.0
        syntax varlist(min=2 max=2), gen(string)
        tempvar done
        gen byte `done' = 0
        tokenize `varlist'
        local v1 "`1'"
        local v2 "`2'"
        gen `gen' = .
        local group = 1
        qui count if `done'==0
        while r(N)>0 {
            * seed: first remaining value of v1, unassigned observations only
            * (without the done==0 condition, finished groups get overwritten)
            qui bysort `done' (`v1' `v2'): replace `gen'=`group' if `v1'==`v1'[1] & `done'==0
            * spread the group # through shared v2 and v1 values until a full
            * pass assigns nothing new, so chains of any length are followed
            local nprev = -1
            qui count if `done'==0 & `gen'<.
            while r(N)>`nprev' {
                local nprev = r(N)
                qui bysort `done' `v2' (`gen'): replace `gen'=`gen'[1]
                qui bysort `done' `v1' (`gen'): replace `gen'=`gen'[1]
                qui count if `done'==0 & `gen'<.
            }
            * flag this group as finished and move on to the next one
            qui replace `done'=1 if `done'==0 & `gen'<.
            local group=`group'+1
            qui count if `done'==0
        }
    end

You could then use it for your example case like this:

formgrp Z E, gen(group)
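As a way of testing it, the sketch below types in the seven example observations from the original question, runs formgrp on them, and checks the result. It assumes formgrp has already been defined in the current session and that Z and E are entered as one-character strings; the two bysort checks are only a necessary condition (no single Z or E value may end up in two groups), not a full proof that the grouping is right.

```stata
* Example data from the original question (Z and E assumed to be str1)
clear
input str1 Z str1 E
a x
b x
b z
c y
d z
e q
e z
end

formgrp Z E, gen(group)

sort Z E
list Z E group

* Groups expected from the question: c/y on its own, everything else together
assert group == 2 if Z == "c"
assert group == 1 if Z != "c"

* Sanity check: a given Z value, or a given E value, never spans two groups
bysort Z (group): assert group[1] == group[_N]
bysort E (group): assert group[1] == group[_N]
```

If any assert fails, Stata stops with an error, so a silent run through to the end means these checks passed on this data. It would still be worth trying data with longer chains of links than the example has.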

Michael Blasnik

[email protected]

----- Original Message -----

From: "Fredrik Wallenberg" <[email protected]>

To: <[email protected]>

Sent: Friday, July 29, 2005 7:09 PM

Subject: st: Finding and tagging overlapping groups

This is simply a reformulation of a question I sent out yesterday (and didn't get any responses to :)

I have data sets that, when merged, produce a table with many-to-many relationships. The table below contains the IDs from each table (Z and E):

           +----------+
           | Z    E   |
           |----------|
        1. | a    x   |
        2. | b    x   |
        3. | b    z   |
        4. | c    y   |
        5. | d    z   |
           |----------|
        6. | e    q   |
        7. | e    z   |
           +----------+

As a base for further calculations I've created variables showing duplicates and overlap between groups:

           +----------------------------------+
           | Z   E   zdup   edup   overlap    |
           |----------------------------------|
        1. | a   x      0      1         0    |
        2. | b   x      1      1         1    |
        3. | b   z      1      2         1    |
        4. | c   y      0      0         0    |
        5. | d   z      0      2         0    |
           |----------------------------------|
        6. | e   q      1      0         0    |
        7. | e   z      1      2         1    |
           +----------------------------------+

What I need to do is to create a group variable for all records that are linked to each other through overlapping Z/E. In the example above I would like to end up with something like:

           +------------------+
           | zip   ex   group |
           |------------------|
        1. |   a    x       1 |
        2. |   b    x       1 |
        3. |   b    z       1 |
        4. |   c    y       2 |
        5. |   d    z       1 |
           |------------------|
        6. |   e    q       1 |
        7. |   e    z       1 |
           +------------------+

I've spent several days now trying to figure out how to do that in Stata/Filemaker/Excel and haven't solved it yet. Any help would be most welcome!

Fredrik


