# Re: st: Re: Combining multiple observations into one observation with multiple variables

- From: Sven-Oliver Spieß
- To: statalist@hsphsun2.harvard.edu
- Subject: Re: st: Re: Combining multiple observations into one observation with multiple variables
- Date: Wed, 30 Jun 2010 10:57:25 +0200

```
Hi Conor

Generally 'reshape' would do that. Does the following example point in
the right direction?

===================================
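* start from an empty dataset so that -input- can run
clear
* toy copy of Dataset 2: household id and household characteristic id
input id char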
1 1
1 3
1 7
1 11
2 1
2 8
3 2
3 7
3 13
end

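* number the characteristics within each household: 1, 2, ...
bysort id (char): gen count = _n
* one row per household; char1, char2, ... hold the characteristic ids
reshape wide char, i(id) j(count)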
list
===================================

Best,
Sven-Oliver

On Wed, Jun 30, 2010 at 09:06, Conor Hughes <cbhughes@uchicago.edu> wrote:
> Sorry, my tables got smushed:
> Dataset1
> ----------------------------------------
> household id | individual id
> ----------------------------------------
>         1        |        1
>         1        |        2
>         1        |        3
>         2        |        1
>         2        |        2
>         3        |        1
>         3        |        2
>
> Dataset 2
> -----------------------------------------------------------
> household id | household characteristic id
> ------------------------------------------------------------
>         1        |                 1
>         1        |                 3
>         1        |                 7
>         1        |                11
>         2        |                 1
>         2        |                 8
>         3        |                 2
>         3        |                 7
>         3        |                13
>
>
> On Wed, Jun 30, 2010 at 1:40 PM, Conor Hughes <cbhughes@uchicago.edu> wrote:
>> Hi All,
>> I have a couple of survey datasets that I need to merge, but they're
>> organized in an inconvenient way.  The first is organized by
>> household, and individuals within the household.  The second is only
>> organized by household.  I'd like to do a many-to-one merge on
>> household, so as to preserve the individual id's.  However, in the
>> second dataset, rather than adding household characteristics as
>> variables, it adds them as observations, e.g.:
>>
>> Dataset 1
>> ----------------------------------------
>> household id | individual id
>> ----------------------------------------
>>         1        |        1
>>         1        |        2
>>         1        |        3
>>         2        |        1
>>         2        |        2
>>         3        |        1
>>         3        |        2
>>
>> Dataset 2
>> -----------------------------------------------------------
>> household id | household characteristic id
>> -----------------------------------------------------------
>>         1        |                 1
>>         1        |                 3
>>         1        |                 7
>>         1        |                11
>>         2        |                 1
>>         2        |                 8
>>         3        |                 2
>>         3        |                 7
>>         3        |                13
>>
>> I'd prefer, in the second dataset, to have one observation for each
>> household, including household characteristics as dummy variables.  As
>> it is, the only way to get them together is via many-to-many merge,
>> which is foolish and doesn't work well, giving an output like
>> -------------------------------------------------------------------------------
>> household id | individual id | household characteristic id
>> -------------------------------------------------------------------------------
>>          1        |        1         |            1
>>          1        |        2         |            3
>>          1        |        3         |            7
>>          1        |        3         |            11
>>          2        |        1         |             1
>>          2        |        2         |             8
>>          3        |        1         |             2
>>          3        |        2         |             7
>>          3        |        2         |            13
>> which messes up the first dataset, since it creates repeat
>> observations of individuals.  Is there a graceful way of changing
>> the multiple observations per household in the second dataset to one
>> observation per household with characteristics represented as dummy
>> variables?  Any help would be greatly appreciated.  And please let me
>> know if I've described the situation poorly and you'd like
>> clarification.
>>
>> Cheers,
>> Conor
>>
>
>
>
> --
> Conor Hughes
> Mathematics and Economics
> University of Chicago 2011
>
>

```
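
The `reshape` example in the reply gives one row per household, but `char1`, `char2`, ... hold the characteristic ids themselves rather than the 0/1 dummy variables the original question asks for. The following is a minimal sketch of one possible way to get dummies and then do the many-to-one merge, using the toy data from the thread. The variable names `hhid` and `indid`, the `char_` prefix, and the tempfile name `hhchars` are assumptions made for illustration; they are not from the original posts. `merge m:1` assumes Stata 11 or later.

```stata
* --- Dataset 2: one row per household characteristic (toy data) ---
clear
input hhid char
1 1
1 3
1 7
1 11
2 1
2 8
3 2
3 7
3 13
end

* one 0/1 indicator per distinct characteristic id, named after the id
levelsof char, local(levels)
foreach l of local levels {
    generate byte char_`l' = (char == `l')
}

* collapse to one row per household: an indicator is 1 if the household
* has that characteristic on any of its rows
collapse (max) char_*, by(hhid)

* keep the household-level file for the merge (hypothetical name)
tempfile hhchars
save `hhchars'

* --- Dataset 1: one row per individual within household (toy data) ---
clear
input hhid indid
1 1
1 2
1 3
2 1
2 2
3 1
3 2
end

* many individuals per household, one characteristics row per household
merge m:1 hhid using `hhchars'
list
```

Run as a whole do-file (tempfiles and local macros do not persist across separately executed lines), this should leave the seven individual rows intact, with household 1 carrying a 1 in `char_1`, `char_3`, `char_7` and `char_11` and 0 in the other indicators. With many distinct characteristic ids this creates one variable per id, so it is worth checking the output of `levelsof` before generating them.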