
Note: This FAQ is for users of Stata 5, an older version of Stata. It is not relevant for more recent versions.

Stata 5: What is the new reshape command? The updated (28 Jan 1998) reshape command

Title   Stata 5: A little flavor of the new reshape command
Author William Gould, StataCorp
Date January 1998

First of all, you can obtain this new version of reshape by updating the Stata ado-files.

Once you have obtained and installed the latest ado-file update, you can type

 . help reshape

to obtain the full documentation. Here are the highlights:

You can view the data as a collection of observations Xij. One such collection might be

          (wide form)                          (long form)
 
 -i-        ------- Xij --------         -i-  -j-          -Xij-
 id  sex   inc80   inc81   inc82         id   year   sex    inc
 -------------------------------         ----------------------
  1    0    5000    5500    6000          1     80     0   5000
  2    1    2000    2200    3300          1     81     0   5500
  3    0    3000    2000    1000          1     82     0   6000
                                          2     80     1   2000
                                          2     81     1   2200
                                          2     82     1   3300
                                          3     80     0   3000
                                          3     81     0   2000
                                          3     82     0   1000

Using the new reshape, you can convert one form to the other by typing

 . reshape long inc, i(id) j(year)      (goes from wide form to long)
 . reshape wide inc, i(id) j(year)      (goes from long form to wide)
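
To try the conversion yourself, you could type in the wide-form data shown above and reshape it in both directions. This is only a minimal sketch: it assumes the data are entered by hand with input rather than read from a file, and the comment lines are explanatory additions.

```stata
* Enter the wide-form example data by hand (illustrative only)
clear
input id sex inc80 inc81 inc82
1 0 5000 5500 6000
2 1 2000 2200 3300
3 0 3000 2000 1000
end

* Wide to long: inc80, inc81, inc82 become inc, indexed by the new variable year
reshape long inc, i(id) j(year)
list

* Long to wide: back to one observation per id
reshape wide inc, i(id) j(year)
list
```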

In this example, one observation is, at least logically speaking,

 +-------- in the wide form -------+        +--- in the long form ----+
 | . list if id==1                 |        | . list if id==1         |
 |                                 |        |                         |
 |    id  sex  inc80  inc81  inc82 |   OR   |    id  sex  year    inc |
 | 1.  1    0   5000   5500   6000 |        | 1.  1    0    80   5000 |
 +---------------------------------+        | 2.  1    0    81   5500 |
                                            | 3.  1    0    82   6000 |
                                            +-------------------------+

and you want to think of this single “observation” as Xij.

The i variable denotes the logical observation and is often called the group identifier. In our data, i is the variable id.

j denotes the subobservation, so it is often called the subgroup or within-group identifier. In our data, j is year or, more precisely, the variable year when the data are in the long form. There is no j variable in the wide form. Instead, the values of j are appended to the name inc, forming the variables inc80, inc81, and inc82.

That leaves only the variable sex, which we did not specify when we typed

 . reshape long inc, i(id) j(year)
 . reshape wide inc, i(id) j(year)

Since sex was not specified, sex was assumed to be constant within i, and reshape verified this assumption before converting the data. There is no limit to the number of constant-within-i variables, and you do not have to explicitly specify them. reshape now assumes the unmentioned variables are constant and notifies you if this is incorrect.
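
The constancy assumption can also be checked by hand before reshaping. The sketch below illustrates the idea; it is not how reshape itself is implemented. It enters the long-form example data and uses by with assert to confirm that sex takes a single value within each id.

```stata
* Enter the long-form example data by hand (illustrative only)
clear
input id year sex inc
1 80 0 5000
1 81 0 5500
1 82 0 6000
2 80 1 2000
2 81 1 2200
2 82 1 3300
3 80 0 3000
3 81 0 2000
3 82 0 1000
end

* by requires the data to be sorted on the by-variable
sort id year

* Within each id, every value of sex must equal the first one;
* assert stops with an error if any observation fails the test
by id: assert sex == sex[1]

* sex is not mentioned here, so reshape treats it as constant within id
reshape wide inc, i(id) j(year)
list
```

If you change one value of sex within a single id before running this, the assert should fail, and reshape wide would likewise be expected to report the problem rather than convert the data.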

The syntax of reshape is

 reshape {wide|long} Xij-variables, i(i-variable) j(j-variable)

Here is an example with two Xij variables with the data in wide form:

 . list

     id  sex   inc80   inc81   inc82   ue80   ue81  ue82
 1.   1    0    5000    5500    6000      0      1     0
 2.   2    1    2000    2200    3300      1      0     0
 3.   3    0    3000    2000    1000      0      0     1
 

To convert these data into long form, type

  . reshape long inc ue, i(id) j(year)
 

Note that there is no variable named year in our original wide dataset. year will be a new variable in our long dataset. After conversion, we will have

 . list
 
     id   year   sex     inc    ue
 1.   1     80     0    5000     0
 2.   1     81     0    5500     1
 3.   1     82     0    6000     0
 <output omitted>
 9.   3     82     0    1000     1

Similarly, if we took this dataset and typed

 . reshape wide inc ue, i(id) j(year)
 

we would be back to our original data:

 . list
 
     id  sex   inc80   inc81   inc82   ue80   ue81  ue82
 1.   1    0    5000    5500    6000      0      1     0
 2.   2    1    2000    2200    3300      1      0     0
 3.   3    0    3000    2000    1000      0      0     1

Converting from wide to long creates the j (year) variable.

Converting from long to wide drops the j (year) variable.
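
One way to watch the j variable appear and disappear is to run describe after each conversion. The following is a sketch only: it assumes the two-variable example data are typed in by hand, and the comments are explanatory additions.

```stata
* Enter the wide-form data with two Xij variables, inc and ue (illustrative only)
clear
input id sex inc80 inc81 inc82 ue80 ue81 ue82
1 0 5000 5500 6000 0 1 0
2 1 2000 2200 3300 1 0 0
3 0 3000 2000 1000 0 0 1
end

* Wide form: there is no variable named year yet
describe

* Wide to long: year is created from the suffixes 80, 81, and 82
reshape long inc ue, i(id) j(year)
describe

* Long to wide: year is dropped and its values return as suffixes
reshape wide inc ue, i(id) j(year)
describe
```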
