Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Creating household id for groups of persons

From: Robert Picard
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Creating household id for groups of persons
Date: Wed, 6 Jul 2011 13:01:26 -0400

```Unless I'm mistaken, Fernando's solution will not always group
households correctly. In the simple example below, there are 3
contracts covering 4 different members of the same household. Such cases
require more than one pass over the data (contract 13 links ids 2 and
4, and then contracts 11 and 12 bring ids 1, 2, 3 and 4 together).

* --------------------- begin example ---------------------
clear all
input contract id
11  1
11  2
12  3
12  4
13  2
13  4
end

tempfile f
qui save "`f'"

* implement Fernando's approach
egen cid = group(contract)
bysort id: egen mincid = min(cid)
bysort contract: egen hid = min(mincid)
list , noobs clean

* redo using -group_id-
use "`f'", clear
clonevar hid = id
group_id hid, match(contract)
list , noobs clean

* --------------------- end example -----------------------

On Wed, Jul 6, 2011 at 11:47 AM, Hans Meier <mr.hans.meier@web.de> wrote:
> Hello Austin and Robert,
>
> thank you for your solutions.
> I'm sure they would work, but I have a very large dataset, so Austin's solution would take hours, and for Robert's solution I would have to use SSC.
>
> But another Stata user sent me this solution:
>
> From: "Fernando Rios Avila" <f.rios.a@gmail.com>
> Sent: 06.07.2011 15:18:00
> To: "'Hans Meier'" <mr.hans.meier@web.de>
> Subject: RE: st: Creating household id for groups of persons
>
>>Hi Hans,
>>I was playing around with a very small sample similar to yours, and came up with this short piece of code.
>>Here hid3 would be the household id code.
>>
>> egen hid1=group(contract)
>> bysort id: egen hid2=min(hid1)
>> bysort contract: egen hid3=min(hid2)
>>
>>Hope this is what you were looking for.
>>Best
>
>
> It works perfectly, and very fast.
>
> Thank you Fernando!
>
>
>
> -----Original Message-----
> From: "Robert Picard" <picard@netbox.com>
> Sent: 06.07.2011 16:50:42
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Creating household id for groups of persons
>
>>Or get -group_id- from SSC. Using Austin's data:
>>
>>* --------------------- begin example ---------------------
>>clear all
>>input contract id
>> 123 1
>> 123 2
>> 123 3
>> 456 4
>> 456 5
>> 678 1
>> 456 3
>> 789 6
>> 789 7
>> 456 8
>>end
>>
>>clonevar gid = id
>>group_id gid, match(contract)
>>
>>list , noobs clean
>>
>>* --------------------- end example -----------------------
>>
>>
>>On Wed, Jul 6, 2011 at 10:29 AM, Austin Nichols <austinnichols@gmail.com> wrote:
>>> Hans Meier <mr.hans.meier@web.de>:
>>>
>>> Maybe this is what you want?
>>>
>>> clear all
>>> input contract id
>>>  123  1
>>>  123  2
>>>  123  3
>>>  456  4
>>>  456  5
>>>  678  1
>>>  456  3
>>>  789  6
>>>  789  7
>>>  456  8
>>> end
>>> g long obs=_n
>>> egen long i=group(id)
>>> la var i "Person id from 1 to M"
>>> egen long gp=group(contract)
>>> la var gp "Contract id from 1 to G"
>>> bys i (gp):g long ct=sum(gp!=gp[_n-1])
>>> la var ct "n distinct contract by id"
>>> sort i ct
>>> su i, mean
>>> forv i=1/`r(max)' {
>>>  su ct if i==`i', mean
>>>  if r(max)==1 continue
>>>  loc max=r(max)
>>>  su gp if ct==1&i==`i', mean
>>>  loc g1=r(max)
>>>  forv j=2/`max' {
>>>  su gp if ct==`j'&i==`i', mean
>>>  replace gp=`g1' if gp==r(max)
>>>  }
>>>  }
>>> sort obs
>>> drop obs ct i
>>> l, noo clean
>>>
>>>
>>>
>>> On Wed, Jul 6, 2011 at 8:45 AM, Hans Meier <mr.hans.meier@web.de> wrote:
>>>> Yes, now you got my question right.
>>>> I don't know who lives in which household, and I also don't have further information about this.
>>>>
>>>> But I assume that if people have an insurance contract together, they are somehow connected, and I define them as one household.
>>>> (I look only at non-life insurance, no pension funds etc.)
>>>>
>>>> In my example, I define the persons from contract "123" (id's "1", "2", "3") as one household, let's say household A, and those in contract "456" (id's "4", "5") as another household B.
>>>> Now, in contract "678", the id "1" tells me that this is the same person who is also in the contract "123", so I want this contract to be put in household A.
>>>>
>>>> To your question:
>>>> Unfortunately,  I have a very large dataset, so I can't tell if I have one contract in each household that covers all household members.
>>>> To err on the side of caution, I would rather assume I don't have such complete contracts.
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>
>

```
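
The point about needing more than one pass can be sketched in plain Stata, for readers who cannot install -group_id- from SSC. This is only an illustration, not the -group_id- implementation: it repeats the two `egen ... min()` steps from Fernando's approach until no household id changes. The names `hid`, `m1`, `m2` and the local `changed` are assumptions chosen for this sketch, and `id` is assumed to be numeric.

```stata
* Sketch only: plain-Stata, multi-pass grouping (not the -group_id- code).
clear all
input contract id
11  1
11  2
12  3
12  4
13  2
13  4
end

* Start with every person as their own household.
gen long hid = id

* Repeat until a full pass changes nothing.
local changed = 1
while `changed' {
    * Smallest household id seen on each contract ...
    bysort contract: egen long m1 = min(hid)
    * ... then the smallest of those across each person's contracts.
    bysort id: egen long m2 = min(m1)
    quietly count if m2 != hid
    local changed = r(N)
    quietly replace hid = m2
    drop m1 m2
}

* Every contract should now sit in exactly one household.
bysort contract (hid): assert hid[1] == hid[_N]
list, noobs clean
```

Each pass sorts the data twice, and the number of passes grows with the longest chain of linked contracts, so this may be slow on a very large dataset.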
