Re: st: Re: Combining multiple observations into one observation with multiple variables


From   Conor Hughes <cbhughes@uchicago.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Combining multiple observations into one observation with multiple variables
Date   Wed, 30 Jun 2010 16:11:14 +0700

Hi Sven,
Thanks for your reply; that did the trick perfectly.  I'd never heard
of the --reshape-- command before.

Thanks again,
Conor

2010/6/30 Sven-Oliver Spieß <svenoliverspiess@gmail.com>:
> Hi Conor
>
>
> Generally 'reshape' would do that. Does the following example point in
> the right direction?
>
> ===================================
> input id char
> 1 1
> 1 3
> 1 7
> 1 11
> 2 1
> 2 8
> 3 2
> 3 7
> 3 13
> end
>
> bysort id (char): gen count = _n
> reshape wide char, i(id) j(count)
> list
> ===================================
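>
> If you need true 0/1 dummy variables rather than the char1, char2, ...
> columns that 'reshape wide' produces, something along these lines
> might work. This is only a sketch: the names has_char*, indiv and the
> tempfile hh are made up for illustration, and 'merge m:1' assumes
> Stata 11 or later.
>
> ===================================
> clear
> input id char
> 1 1
> 1 3
> 1 7
> 1 11
> 2 1
> 2 8
> 3 2
> 3 7
> 3 13
> end
>
> * one 0/1 indicator per distinct characteristic value (names assumed)
> levelsof char, local(levels)
> foreach v of local levels {
>     egen byte has_char`v' = max(char == `v'), by(id)
> }
>
> * keep a single observation per household
> bysort id: keep if _n == 1
> drop char
> tempfile hh
> save `hh'
>
> * individual-level data, then a many-to-one merge on household
> clear
> input id indiv
> 1 1
> 1 2
> 1 3
> 2 1
> 2 2
> 3 1
> 3 2
> end
> merge m:1 id using `hh'
> list
> ===================================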
>
>
>
> Best,
> Sven-Oliver
>
>
> On Wed, Jun 30, 2010 at 09:06, Conor Hughes <cbhughes@uchicago.edu> wrote:
>> Sorry, my tables got smushed:
>> Dataset1
>> ----------------------------------------
>> household id | individual id
>> ----------------------------------------
>>         1        |        1
>>         1        |        2
>>         1        |        3
>>         2        |        1
>>         2        |        2
>>         3        |        1
>>         3        |        2
>>
>> Dataset 2
>> -----------------------------------------------------------
>> household id | household characteristic id
>> ------------------------------------------------------------
>>         1        |                 1
>>         1        |                 3
>>         1        |                 7
>>         1        |                11
>>         2        |                 1
>>         2        |                 8
>>         3        |                 2
>>         3        |                 7
>>         3        |                13
>>
>>
>> On Wed, Jun 30, 2010 at 1:40 PM, Conor Hughes <cbhughes@uchicago.edu> wrote:
>>> Hi All,
>>> I have a couple of survey datasets that I need to merge, but they're
>>> organized in an inconvenient way.  The first is organized by
>>> household, and individuals within the household.  The second is only
>>> organized by household.  I'd like to do a many-to-one merge on
>>> household, so as to preserve the individual id's.  However, in the
>>> second dataset, rather than adding household characteristics as
>>> variables, it adds them as observations, e.g.:
>>>
>>> Dataset 1
>>> -------------------------------------
>>> household id | individual id
>>> -------------------------------------
>>>          1        |        1
>>>          1        |        2
>>>          1        |        3
>>>          2        |        1
>>>          2        |        2
>>>          3        |        1
>>>          3        |        2
>>>
>>> Dataset 2
>>> ------------------------------------------------------------
>>> household id | household characteristic id
>>> ------------------------------------------------------------
>>>          1        |            1
>>>          1        |            3
>>>          1        |            7
>>>          1        |           11
>>>          2        |            1
>>>          2        |            8
>>>          3        |            2
>>>          3        |            7
>>>          3        |           13
>>>
>>> I'd prefer, in the second dataset, to have one observation for each
>>> household, including household characteristics as dummy variables.  As
>>> it is, the only way to get them together is via many-to-many merge,
>>> which is foolish and doesn't work well, giving an output like
>>> -------------------------------------------------------------------------------
>>> household id | individual id | household characteristic id
>>> -------------------------------------------------------------------------------
>>>          1        |        1         |            1
>>>          1        |        2         |            3
>>>          1        |        3         |            7
>>>          1        |        3         |            11
>>>          2        |        1         |             1
>>>          2        |        2         |             8
>>>          3        |        1         |             2
>>>          3        |        2         |             7
>>>          3        |        2         |            13
>>> This messes up the first dataset, since it creates repeated
>>> observations of individuals.  Is there a graceful way of changing
>>> the multiple observations per household in the second dataset to one
>>> observation per household with characteristics represented as dummy
>>> variables?  Any help would be greatly appreciated.  And please let me
>>> know if I've described the situation poorly and you'd like
>>> clarification.
>>>
>>> Cheers,
>>> Conor
>>>
>>
>>
>>
>> --
>> Conor Hughes
>> Mathematics and Economics
>> University of Chicago 2011
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>



-- 
Conor Hughes
Mathematics and Economics
University of Chicago 2011
