You're not mistaken; I now see the mistake too. I have switched to your -group_id- solution, which also runs very fast. Thank you!

The reason Austin's solution didn't suit my data is that I have over 500000 contracts, and it would take hours to run "forv i=1/`r(max)'" on them.

-----Original Message-----
From: "Robert Picard" <picard@netbox.com>
Sent: 06.07.2011 19:01:26
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Creating household id for groups of persons

>Unless I'm mistaken, Fernando's solution will not always group
>households correctly. In the simple example below, there are 3
>contracts with 4 different members of the same household. Such cases
>require more than one pass over the data (contract 13 groups ids 2 and
>4, and then contracts 11 and 12 group 1 2 3 4 together).
>
>* --------------------- begin example ---------------------
>clear all
>input contract id
> 11 1
> 11 2
> 12 3
> 12 4
> 13 2
> 13 4
>end
>
>tempfile f
>qui save "`f'"
>
>* implement Fernando's approach
>egen cid = group(contract)
>bysort id: egen mincid = min(cid)
>bysort contract: egen hid = min(mincid)
>list , noobs clean
>
>* redo using -group_id-
>use "`f'", clear
>clonevar hid = id
>group_id hid, match(contract)
>list , noobs clean
>
>* --------------------- end example -----------------------
>
>
>
>On Wed, Jul 6, 2011 at 11:47 AM, Hans Meier <mr.hans.meier@web.de> wrote:
>> Hello Austin and Robert,
>>
>> thank you for your solutions.
>> I'm sure they would work, but I have a very large dataset, so Austin's solution would take hours, and for Robert's solution I would have to use SSC.
>>
>> But another Stata user sent me this solution:
>>
>> From: "Fernando Rios Avila" <f.rios.a@gmail.com>
>> Sent: 06.07.2011 15:18:00
>> To: "'Hans Meier'" <mr.hans.meier@web.de>
>> Subject: RE: st: Creating household id for groups of persons
>>
>>>Hi Hans,
>>>I was playing around with a very small sample similar to yours, and came up with this small piece of code.
>>>Here hid3 would be the household id code.
>>>
>>> egen hid1=group(contract)
>>> bysort id: egen hid2=min(hid1)
>>> bysort contract: egen hid3=min(hid2)
>>>
>>>Hope this is what you were looking for.
>>>Best
>>
>>
>> It works perfectly, and very fast.
>>
>> Thank you Fernando!
>>
>>
>>
>> -----Original Message-----
>> From: "Robert Picard" <picard@netbox.com>
>> Sent: 06.07.2011 16:50:42
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Creating household id for groups of persons
>>
>>>Or get -group_id- from SSC. Using Austin's data:

* --------------------- begin example ---------------------
clear all
input contract id
 123 1
 123 2
 123 3
 456 4
 456 5
 678 1
 456 3
 789 6
 789 7
 456 8
end

clonevar gid = id
group_id gid, match(contract)

list , noobs clean

* --------------------- end example -----------------------
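
A household in this thread is in effect a connected component: any two people linked through a chain of shared contracts belong together, and a long chain can need several passes to resolve. Where installing from SSC is not an option, the idea can be sketched with built-in commands by repeating the two min() passes until nothing changes. This is only a sketch under assumptions: it expects numeric variables named contract and id as in the examples in this thread, the names hid, newhid, newhid2 and changed are made up for illustration, and it is not a description of how -group_id- works internally.

* --------------------- begin example ---------------------
clear all
input contract id
 11 1
 11 2
 12 3
 12 4
 13 2
 13 4
end

* start with each person as their own household
clonevar hid = id

* repeat the two passes until no household id changes
local changed = 1
while `changed' {
    * everyone on a contract gets the smallest hid on that contract
    bysort contract: egen long newhid = min(hid)
    * every record of a person gets that person's smallest newhid
    bysort id: egen long newhid2 = min(newhid)
    * count the records whose hid is about to change
    quietly count if newhid2 != hid
    local changed = r(N)
    quietly replace hid = newhid2
    drop newhid newhid2
}

* check: no contract and no person is split across households
bysort contract (hid): assert hid[1] == hid[_N]
bysort id (hid): assert hid[1] == hid[_N]

list , noobs clean
* --------------------- end example -----------------------

On the six-row example above, every row should end up with hid equal to 1. Each pass sorts the full dataset, so run time depends on how many passes the longest chain of contracts needs; this sketch has not been timed on a dataset of the size discussed here.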



On Wed, Jul 6, 2011 at 10:29 AM, Austin Nichols <austinnichols@gmail.com> wrote:
> Hans Meier <mr.hans.meier@web.de>:
>
> Maybe this is what you want?
>
> clear all
> input contract id
> 123 1
> 123 2
> 123 3
> 456 4
> 456 5
> 678 1
> 456 3
> 789 6
> 789 7
> 456 8
> end
> g long obs=_n
> egen long i=group(id)
> la var i "Person id from 1 to M"
> egen long gp=group(contract)
> la var gp "Contract id from 1 to G"
> bys i (gp):g long ct=sum(gp!=gp[_n-1])
> la var ct "n distinct contract by id"
> sort i ct
> su i, mean
> forv i=1/`r(max)' {
> su ct if i==`i', mean
> if r(max)==1 continue
> loc max=r(max)
> su gp if ct==1&i==`i', mean
> loc g1=r(max)
> forv j=2/`max' {
> su gp if ct==`j'&i==`i', mean
> replace gp=`g1' if gp==r(max)
> }
> }
> sort obs
> drop obs ct i
> l, noo clean
>
>
>
> On Wed, Jul 6, 2011 at 8:45 AM, Hans Meier <mr.hans.meier@web.de> wrote:
>> Yes, now you got my question right.
>> I don't know who lives in which household, and I don't have any further information about this.
>>
>> But I assume that if people have an insurance contract together, they are somehow connected, and I define them as one household.
>> (I look only at non-life insurance, no pension funds etc.)
>>
>> In my example, I define the persons from contract "123" (id's "1", "2", "3") as one household, let's say household A, and those in contract "456" (id's "4", "5") as another household B.
>> Now, in contract "678", the id "1" tells me that this is the same person who is also in the contract "123", so I want this contract to be put in household A.
>>
>> To your question:
>> Unfortunately, I have a very large dataset, so I can't tell if I have one contract in each household that covers all household members.
>> To err on the side of caution, I would rather assume I don't have such complete contracts.

