Statalist



RE: st: Reshaping data question


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Reshaping data question
Date   Tue, 8 Apr 2008 12:00:03 +0100

The theme that you may need two -reshape-s is ventilated, with examples,
within

FAQ     . . . . . . . . . . . . . . . . . . . .  Problems with reshape
        12/03   I am having problems with the reshape command. Can
                you give further guidance?
                http://www.stata.com/support/faqs/data/reshape3.html

An alternative to Phil's 

. ren yrGDP GDP

. ren yrPopulation Population

is just 

. renpfix yr 

-renpfix- removes the prefix yr from every variable name that begins
with it, so the saving would be greater in a more extensive example
with more variables.
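
To see the whole sequence end to end, here is a minimal sketch that types
in Glenn's fictional data with -input- and finishes with -renpfix-. The
storage types str9 and str10 and the sepby() option on -list- are
assumptions made for illustration; they are not part of the original posts.

```stata
* Enter the fictional data from the original question
* (str9/str10 are assumed widths, just long enough for these values)
clear
input str9 country str10 data_series yr1960 yr1961 yr1962
"Argentina" "GDP"        5  7  9
"Argentina" "Population" 10 8  5
"Brazil"    "GDP"        1  9  10
"Brazil"    "Population" 5  35 12
end

* Step 1: go fully long, one row per country, series and year
reshape long yr, i(country data_series) j(year)

* Step 2: come back to wide on the series dimension
reshape wide yr, i(country year) j(data_series) string

* Step 3: strip the yr prefix from yrGDP and yrPopulation in one go
renpfix yr

list, sepby(country)
```

Note that year is not touched by -renpfix yr-, because its name begins
with "ye", not "yr".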

Nick
n.j.cox@durham.ac.uk 

Phil Schumm wrote:

On Apr 7, 2008, at 9:27 PM, Glenn Hoetker wrote:
> The data is in this layout (with fictional data)
>
> COUNTRY    DATA_SERIES  YR1960  YR1961  YR1962
> Argentina  GDP          5       7       9
> Argentina  Population   10      8       5
> Brazil     GDP          1       9       10
> Brazil     Population   5       35      12
>
> I would like to get it into this format:
>
> COUNTRY       YEAR        GDP     POPULATION
> Argentina     1960        5               10
> Argentina     1961        7                8
> ...
> Brazil        1960        1                5
> Brazil        1961        9               35
> etc.


The trick in this case is to go "fully long" and then come back to
wide (but with respect to the data series dimension rather than the
time dimension):


. li

      +---------------------------------------------------+
      |   country   data_ser~s   yr1960   yr1961   yr1962 |
      |---------------------------------------------------|
   1. | Argentina          GDP        5        7        9 |
   2. | Argentina   Population       10        8        5 |
   3. |    Brazil          GDP        1        9       10 |
   4. |    Brazil   Population        5       35       12 |
      +---------------------------------------------------+

. reshape long yr, i(country data_series) j(year)
<output omitted>

. reshape wide yr, i(country year) j(data_series) string
<output omitted>

. ren yrGDP GDP

. ren yrPopulation Population

. li

      +-----------------------------------+
      |   country   year   GDP   Popula~n |
      |-----------------------------------|
   1. | Argentina   1960     5         10 |
   2. | Argentina   1961     7          8 |
   3. | Argentina   1962     9          5 |
   4. |    Brazil   1960     1          5 |
   5. |    Brazil   1961     9         35 |
      |-----------------------------------|
   6. |    Brazil   1962    10         12 |
      +-----------------------------------+


The general principle here is to start by making your data as long as  
possible; from that point, it's usually easier to see how to get to  
where you want to go.  Note that the code above assumes that  
data_series is a string variable; if not, you'll need to make  
appropriate adjustments.
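
One such adjustment can be sketched as follows, assuming that data_series
is numeric and carries value labels (say 1 "GDP" and 2 "Population") whose
text is legal as part of a variable name. The idea is to -decode- it into a
string first and then proceed as above. The variable name series is
hypothetical.

```stata
* Assumption: data_series is numeric with value labels such as
* 1 "GDP" and 2 "Population"; series is a hypothetical new variable name
decode data_series, generate(series)
drop data_series

* Same two-step reshape as above, using the string copy
reshape long yr, i(country series) j(year)
reshape wide yr, i(country year) j(series) string

* Remove the yr prefix from the resulting variable names
renpfix yr
```

If there are no value labels, another route is to leave data_series numeric
and omit the string option, in which case -reshape wide- produces yr1 and
yr2, which then need to be renamed by hand.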
