

From: Robert Picard <picard@netbox.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Creating household id for groups of persons
Date: Wed, 6 Jul 2011 13:01:26 -0400

Unless I'm mistaken, Fernando's solution will not always group households correctly. In the simple example below, there are 3 contracts covering 4 different members of the same household. Such cases require more than one pass over the data (contract 13 groups ids 2 and 4, and then contracts 11 and 12 group ids 1, 2, 3, and 4 together).

* --------------------- begin example ---------------------
clear all
input contract id
11 1
11 2
12 3
12 4
13 2
13 4
end
tempfile f
qui save "`f'"

* implement Fernando's approach
egen cid = group(contract)
bysort id: egen mincid = min(cid)
bysort contract: egen hid = min(mincid)
list , noobs clean

* redo using -group_id-
use "`f'", clear
clonevar hid = id
group_id hid, match(contract)
list , noobs clean
* --------------------- end example -----------------------

On Wed, Jul 6, 2011 at 11:47 AM, Hans Meier <mr.hans.meier@web.de> wrote:
> Hello Austin and Robert,
>
> thank you for your solutions.
> I'm sure they would work, but I have a very large dataset, so Austin's solution would take hours, and for Robert's solution I would have to use SSC.
>
> But another Stata user sent me this solution:
>
> From: "Fernando Rios Avila" <f.rios.a@gmail.com>
> Sent: 06.07.2011 15:18:00
> To: "'Hans Meier'" <mr.hans.meier@web.de>
> Subject: RE: st: Creating household id for groups of persons
>
>>Hi Hans,
>>I was playing around with a very small sample similar to yours, and came up with this small code.
>>Here hid3 would be the household id code.
>>
>> egen hid1=group (contract)
>> bysort id: egen hid2=min(hid)
>> bysort contract:egen hid3=min(hid2)
>>
>>Hope this is what you were looking for.
>>Best
>
>
> It works perfectly, and very fast.
>
> Thank you Fernando!
>
>
>
> -----Original Message-----
> From: "Robert Picard" <picard@netbox.com>
> Sent: 06.07.2011 16:50:42
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Creating household id for groups of persons
>
>>Or get -group_id- from SSC. Using Austin's data:
>>
>>* --------------------- begin example ---------------------
>>clear all
>>input contract id
>> 123 1
>> 123 2
>> 123 3
>> 456 4
>> 456 5
>> 678 1
>> 456 3
>> 789 6
>> 789 7
>> 456 8
>>end
>>
>>clonevar gid = id
>>group_id gid, match(contract)
>>
>>list , noobs clean
>>
>>* --------------------- end example -----------------------
>>
>>
>>On Wed, Jul 6, 2011 at 10:29 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> Hans Meier <mr.hans.meier@web.de>:
>>>
>>> Maybe this is what you want?
>>>
>>> clear all
>>> input contract id
>>> 123 1
>>> 123 2
>>> 123 3
>>> 456 4
>>> 456 5
>>> 678 1
>>> 456 3
>>> 789 6
>>> 789 7
>>> 456 8
>>> end
>>> g long obs=_n
>>> egen long i=group(id)
>>> la var i "Person id from 1 to M"
>>> egen long gp=group(contract)
>>> la var gp "Contract id from 1 to G"
>>> bys i (gp):g long ct=sum(gp!=gp[_n-1])
>>> la var ct "n distinct contract by id"
>>> sort i ct
>>> su i, mean
>>> forv i=1/`r(max)' {
>>>  su ct if i==`i', mean
>>>  if r(max)==1 continue
>>>  loc max=r(max)
>>>  su gp if ct==1&i==`i', mean
>>>  loc g1=r(max)
>>>  forv j=2/`max' {
>>>   su gp if ct==`j'&i==`i', mean
>>>   replace gp=`g1' if gp==r(max)
>>>  }
>>> }
>>> sort obs
>>> drop obs ct i
>>> l, noo clean
>>>
>>>
>>>
>>> On Wed, Jul 6, 2011 at 8:45 AM, Hans Meier <mr.hans.meier@web.de> wrote:
>>>> Yes, now you got my question right.
>>>> I don't know who lives in which household, and I also don't have further information about this.
>>>>
>>>> But I assume that if people have an insurance contract together, they are somehow connected, and I define them as one household.
>>>> (I look only at non-life insurance, no pension funds etc.)
>>>>
>>>> In my example, I define the persons from contract "123" (id's "1", "2", "3") as one household, let's say household A, and those in contract "456" (id's "4", "5") as another household B.
>>>> Now, in contract "678", the id "1" tells me that this is the same person who is also in the contract "123", so I want this contract to be put in household A.
>>>>
>>>> To your question:
>>>> Unfortunately, I have a very large dataset, so I can't tell if I have one contract in each household that covers all household members.
>>>> To err on the side of caution, I would rather assume I don't have such complete contracts.
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
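
For anyone who wants to stay with official commands only, the point about needing more than one pass can be sketched as a loop that repeats the two -egen- steps until the household labels stop changing. This is only a minimal sketch on the six-row example from the top of this message, not a description of how -group_id- works internally. The variable names hid, old, m1 and m2 are made up for illustration, and run time on a very large dataset has not been checked.

```stata
clear all
input contract id
11 1
11 2
12 3
12 4
13 2
13 4
end

* start with each person's own id as a provisional household label
gen long hid = id

* repeat the two min() passes until no label changes any more
local changed = 1
while `changed' {
    gen long old = hid
    * smallest label among all persons on the same contract
    bysort contract: egen long m1 = min(hid)
    * smallest such label across all contracts of the same person
    bysort id: egen long m2 = min(m1)
    replace hid = m2
    * r(N) is the number of observations whose label changed in this pass
    count if hid != old
    local changed = r(N)
    drop old m1 m2
}

* in this example all four persons end up in household 1
assert hid == 1
list , noobs clean
```

On this example the loop runs four times: three passes that change at least one label and a final pass that confirms nothing changes. Long chains of overlapping contracts need more passes, so on a big file the loop may be slow.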
