Statalist



Re: st: Reshaping data question


From   Phil Schumm <pschumm@uchicago.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Reshaping data question
Date   Tue, 8 Apr 2008 04:44:59 -0500

On Apr 7, 2008, at 9:27 PM, Glenn Hoetker wrote:
The data is in this layout (with fictional data)

COUNTRY    DATA_SERIES  YR1960  YR1961  YR1962
Argentina  GDP          5       7       9
Argentina  Population   10      8       5
Brazil     GDP          1       9       10
Brazil     Population   5       35      12

I would like to get it into this format:

COUNTRY       YEAR        GDP     POPULATION
Argentina     1960        5               10
Argentina     1961        7                8
...
Brazil        1960        1                5
Brazil        1961        9               35
etc.

The trick in this case is to go "fully long" and then come back to wide (but with respect to the data series dimension rather than the time dimension):


. li

     +---------------------------------------------------+
     |   country   data_ser~s   yr1960   yr1961   yr1962 |
     |---------------------------------------------------|
  1. | Argentina          GDP        5        7        9 |
  2. | Argentina   Population       10        8        5 |
  3. |    Brazil          GDP        1        9       10 |
  4. |    Brazil   Population        5       35       12 |
     +---------------------------------------------------+

. reshape long yr, i(country data_series) j(year)
<output omitted>

. reshape wide yr, i(country year) j(data_series) string
<output omitted>

. ren yrGDP GDP

. ren yrPopulation Population

. li

     +-----------------------------------+
     |   country   year   GDP   Popula~n |
     |-----------------------------------|
  1. | Argentina   1960     5         10 |
  2. | Argentina   1961     7          8 |
  3. | Argentina   1962     9          5 |
  4. |    Brazil   1960     1          5 |
  5. |    Brazil   1961     9         35 |
     |-----------------------------------|
  6. |    Brazil   1962    10         12 |
     +-----------------------------------+
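
For anyone who wants to run the steps end to end, here is a minimal sketch as a do-file. It assumes the fictional values from the question, typed in with -input-; the storage types (str9, str10) are my own choice for illustration and not part of the original data.

```stata
* Sketch only: recreate the fictional data from the question.
clear
input str9 country str10 data_series yr1960 yr1961 yr1962
"Argentina" "GDP"         5  7  9
"Argentina" "Population" 10  8  5
"Brazil"    "GDP"         1  9 10
"Brazil"    "Population"  5 35 12
end

* Step 1: go fully long, one row per country, series, and year
* (4 rows x 3 years = 12 rows, with the values held in yr).
reshape long yr, i(country data_series) j(year)
list, sepby(country)

* Step 2: come back to wide across the series dimension
* (2 countries x 3 years = 6 rows, with yrGDP and yrPopulation).
reshape wide yr, i(country year) j(data_series) string

* Step 3: drop the "yr" stub from the new variable names.
rename yrGDP GDP
rename yrPopulation Population
list, sepby(country)
```

Listing the data after the first reshape shows the intermediate "fully long" layout, which is the part hidden by the omitted output above.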


The general principle here is to start by making your data as long as possible; from that point, it's usually easier to see how to get to where you want to go. Note that the code above assumes that data_series is a string variable; if not, you'll need to make appropriate adjustments.
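
As one possible adjustment, suppose data_series were a numeric variable with value labels rather than a string. The sketch below builds that situation with -encode- (the names series_name and series are hypothetical, introduced only for this example) and then uses -decode- to recover a string before reshaping back to wide.

```stata
* Sketch only: same fictional data, but with data_series stored as a
* labelled numeric variable instead of a string.
clear
input str9 country str10 series_name yr1960 yr1961 yr1962
"Argentina" "GDP"         5  7  9
"Argentina" "Population" 10  8  5
"Brazil"    "GDP"         1  9 10
"Brazil"    "Population"  5 35 12
end
encode series_name, generate(data_series)
drop series_name

* Going long works the same way with a numeric identifier.
reshape long yr, i(country data_series) j(year)

* Convert the value labels back to a string so that j() can supply
* readable suffixes (GDP, Population) for the new variables.
decode data_series, generate(series)
drop data_series
reshape wide yr, i(country year) j(series) string

rename yrGDP GDP
rename yrPopulation Population
list, sepby(country)
```

If the numeric codes carry no value labels, another option is to leave out the string option and reshape on the numeric variable directly; the new variables are then named from the codes (for example yr1 and yr2) and can be renamed by hand.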


-- Phil




