Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Need help with transforming dataset, please


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Need help with transforming dataset, please
Date   Tue, 26 Feb 2013 08:12:26 +0000

Here is another way to do it, this time in place.

qui foreach s in Math English Geog French {
      gen `s' = .
      forval j = 1/4 {
             replace `s' = Grade`j' if Subject`j' == upper("`s'")
     }
}

drop Subject* Grade*

In a real example there might be subjects such as "Integrated Science"
with spaces or other characters that couldn't appear in variable
names. Replace all such characters by underscores, or remove them.

Nick

On Tue, Feb 26, 2013 at 7:52 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Note that this case is documented, at least in terms of an aphorism or
> slogan, within
>
> http://www.stata.com/support/faqs/data-management/problems-with-reshape/
>
> The slogan is "You may need two reshapes to get to where you want to be".
>
> Nick
>
> On Tue, Feb 26, 2013 at 12:32 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> . reshape long Subject Grade, i(StudId)
>> (note: j = 1 2 3 4)
>>
>> Data                               wide   ->   long
>> -----------------------------------------------------------------------------
>> Number of obs.                        4   ->      16
>> Number of variables                  10   ->       5
>> j variable (4 values)                     ->   _j
>> xij variables:
>>          Subject1 Subject2 ... Subject4   ->   Subject
>>                Grade1 Grade2 ... Grade4   ->   Grade
>> -----------------------------------------------------------------------------
>>
>> . drop if missing(Subject)
>> (2 observations deleted)
>>
>> . replace Subject = proper(Subject)
>> (14 real changes made)
>>
>> . drop _j
>>
>> . reshape wide Grade , string i(StudId) j(Subject)
>> (note: j = English French Geog Math)
>>
>> Data                               long   ->   wide
>> -----------------------------------------------------------------------------
>> Number of obs.                       14   ->       4
>> Number of variables                   4   ->       6
>> j variable (4 values)           Subject   ->   (dropped)
>> xij variables:
>>                                   Grade   ->   GradeEnglish
>> GradeFrench ... GradeMath
>> -----------------------------------------------------------------------------
>>
>> . renpfix Grade
>>
>> . l
>>
>>      +--------------------------------------------------+
>>      | StudId   English   French   Geog   Math   Gender |
>>      |--------------------------------------------------|
>>   1. |    121         4        5      5      3        F |
>>   2. |    122         4        .      5      5        F |
>>   3. |    123         3        2      3      1        M |
>>   4. |    124         2        .      1      2        M |
>>      +--------------------------------------------------+
>>
>> Here's the code in one
>>
>> reshape long Subject Grade, i(StudId)
>> drop if missing(Subject)
>> replace Subject = proper(Subject)
>> drop _j
>> reshape wide Grade , string i(StudId) j(Subject)
>> renpfix Grade
>> l
>>
>> Nick
>>
>> On Tue, Feb 26, 2013 at 12:18 AM, Bidemi Carrol <bidemi@gmail.com> wrote:
>>
>>> I would appreciate some help with transforming the dataset below.
>>> Reshape does not seem to work, but I may be applying it wrong.
>>>
>>> The original dataset is as follows (with many more grade and subject
>>> combinations):
>>>
>>> StudId Gender  Subject1 Grade1 Subject2  Grade2   Subject3  Grade3
>>> Subject4  Grade4
>>> 121    F           MATH       3        ENGLISH  4           GEOG     5
>>>         FRENCH  5
>>> 122    F           ENGLISH  4        GEOG      5           MATH      5
>>> 123    M          FRENCH   2        ENGLISH  3           MATH      1
>>>        GEOG      3
>>> 124    M          GEOG       1        MATH       2          ENGLISH  2
>>>
>>> I would like it to look like this:
>>>
>>> StudID  Gender  Math  English  French  Geog
>>> 121        F          3         4             5        5
>>> 122        F          5         4                       5
>>> 123        M         1         3             2         3
>>> 124        M          2        2                        1
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index