



Re: st: Re: Combining multiple observations into one observation with multiple variables


From   Sven-Oliver Spieß <svenoliverspiess@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Combining multiple observations into one observation with multiple variables
Date   Wed, 30 Jun 2010 10:57:25 +0200

Hi Conor


Generally 'reshape' would do that. Does the following example point in
the right direction?

===================================
input id char
1 1
1 3
1 7
1 11
2 1
2 8
3 2
3 7
3 13
end

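* Number the characteristics within each household (1, 2, 3, ...)
bysort id (char): gen count = _n
* One row per household: char1, char2, ... hold the characteristic ids
reshape wide char, i(id) j(count)
list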
===================================
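
If you want one dummy variable per characteristic rather than char1, char2, ... holding the ids in order, a variation on the same idea might look like the sketch below. It has not been run against your data. The names hh_id and has are my own placeholders, and it assumes each household/characteristic pair appears only once.

```stata
* Sketch only: hh_id and has are assumed names, not taken from the real data
clear
input hh_id char
1 1
1 3
1 7
1 11
2 1
2 8
3 2
3 7
3 13
end

* Mark every characteristic a household has, then spread it across columns.
* j(char) names the new variables after the characteristic id: has1, has3, ...
gen byte has = 1
reshape wide has, i(hh_id) j(char)

* Characteristics a household lacks come out as missing; recode them to 0
foreach v of varlist has* {
    replace `v' = 0 if missing(`v')
}
list
```

With one row per household, a many-to-one merge from the individual-level file (merge m:1 hh_id using the saved household file, in Stata 11 or later) should then keep one row per individual.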



Best,
Sven-Oliver


On Wed, Jun 30, 2010 at 09:06, Conor Hughes <cbhughes@uchicago.edu> wrote:
> Sorry, my tables got smushed:
> Dataset1
> ----------------------------------------
> household id | individual id
> ----------------------------------------
>         1        |        1
>         1        |        2
>         1        |        3
>         2        |        1
>         2        |        2
>         3        |        1
>         3        |        2
>
> Dataset 2
> -----------------------------------------------------------
> household id | household characteristic id
> ------------------------------------------------------------
>         1        |                 1
>         1        |                 3
>         1        |                 7
>         1        |                11
>         2        |                 1
>         2        |                 8
>         3        |                 2
>         3        |                 7
>         3        |                13
>
>
> On Wed, Jun 30, 2010 at 1:40 PM, Conor Hughes <cbhughes@uchicago.edu> wrote:
>> Hi All,
>> I have a couple of survey datasets that I need to merge, but they're
>> organized in an inconvenient way.  The first is organized by
>> household, and individuals within the household.  The second is only
>> organized by household.  I'd like to do a many-to-one merge on
>> household, so as to preserve the individual id's.  However, in the
>> second dataset, rather than adding household characteristics as
>> variables, it adds them as observations, e.g.:
>>
>> Dataset 1
>> -------------------------------------
>> household id | individual id
>> -------------------------------------
>>          1        |        1
>>          1        |        2
>>          1        |        3
>>          2        |        1
>>          2        |        2
>>          3        |        1
>>          3        |        2
>>
>> Dataset 2
>> -----------------------------------------------------------
>> household id | household characteristic id
>> -----------------------------------------------------------
>>          1        |            1
>>          1        |            3
>>          1        |            7
>>          1        |            11
>>          2        |             1
>>          2        |             8
>>          3        |             2
>>          3        |             7
>>          3        |             13
>>
>> I'd prefer, in the second dataset, to have one observation for each
>> household, including household characteristics as dummy variables.  As
>> it is, the only way to get them together is via many-to-many merge,
>> which is foolish and doesn't work well, giving an output like
>> -------------------------------------------------------------------------------
>> household id | individual id | household characteristic id
>> -------------------------------------------------------------------------------
>>          1        |        1         |            1
>>          1        |        2         |            3
>>          1        |        3         |            7
>>          1        |        3         |            11
>>          2        |        1         |             1
>>          2        |        2         |             8
>>          3        |        1         |             2
>>          3        |        2         |             7
>>          3        |        2         |            13
>>    Which messes up the first dataset, since it creates repeat
>> observations of individuals.  Is there a graceful way of changing
>> the multiple observations per household in the second dataset to one
>> observation per household with characteristics represented as dummy
>> variables?  Any help would be greatly appreciated.  And please let me
>> know if I've described the situation poorly and you'd like
>> clarification.
>>
>> Cheers,
>> Conor
>>
>
>
>
> --
> Conor Hughes
> Mathematics and Economics
> University of Chicago 2011
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


