Note: This FAQ is for users of Stata 5, an older version of Stata. It is not relevant for more recent versions.

Stata 5: Why does reshape give a too-many-variables error?

Title   Stata 5: Avoiding “too many variables” when using reshape
Author James Hardin, StataCorp
Date January 1996; updated January 1998

You type

 . reshape groups year 90-95
 . reshape vars inc
 . reshape cons id sex age race hgt wgt shoesize hatsize ms kids addr
 too many variables 
 r(103);

reshape cons allows a maximum of 10 variables to be specified, and there are 11 in the above example. This problem has been resolved in the updated version of reshape. The new version of reshape also has a simpler syntax and other nice features. In addition, the new version of reshape understands the old reshape syntax, so your prior do-files and ado-files will still work.

Information on how to obtain the updated version of reshape is available from StataCorp.

Nevertheless, here is how to work around the limitation in the old version of reshape:

Step 1. Create demogs.dta containing the demographic variables.

 . use yourdata
 . keep id sex age race hgt wgt shoesize hatsize ms kids addr
 . sort id
 . quietly by id: keep if _n==1
 . save demogs, replace 

The quietly by id: keep if _n==1 line keeps a single observation per id. It is necessary only if your data are in the long form, where each id appears once per year, but it does no harm if they are in the wide form.

In the above example, we assume that the variable id is enough to uniquely identify each subject. If two variables are required (e.g., hospital-id and patient-id), substitute those two variable names for id.
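
As a sketch of that substitution, suppose the two identifying variables are named hospid and patid (hypothetical names, not taken from the example above). Step 1 would then read something like:

 . * hospid and patid are hypothetical names for the two identifying variables
 . use yourdata
 . keep hospid patid sex age race hgt wgt shoesize hatsize ms kids addr
 . sort hospid patid
 . quietly by hospid patid: keep if _n==1
 . save demogs, replace

The same pair would replace id in the later steps as well: reshape cons hospid patid in Step 2, and sort hospid patid followed by merge hospid patid using demogs in Step 3.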

Step 2. Reshape the data using "reshape cons id"

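 . use yourdata, clear
 . reshape groups year 90-95
 . reshape vars inc
 . reshape cons id
 . reshape wide   or   reshape long

That is, reload the full dataset (Step 1 left only the demographic variables in memory) and reshape normally, but note the shorter reshape cons statement.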

Step 3. Merge the demographic data

 . sort id
 . merge id using demogs
 . keep if _merge==3
 . drop _merge

In these lines, we merge back in the nonvarying characteristics from demogs.dta.

The keep if _merge==3 is not strictly necessary, but we recommend it. In the solution as given, keep if _merge==3 does nothing because _merge must be 3 for every observation. If you form demogs.dta one day, however, and then reshape a subset of your data another day, demogs.dta will contain subjects that are not in the reshaped data, and keep if _merge==3 is what removes those unmatched observations.
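
One way to check whether the keep would discard anything, offered as a sketch rather than as part of the original solution, is to tabulate _merge before the keep. A value of 3 marks observations found in both datasets, 1 marks those found only in the data in memory, and 2 marks those found only in demogs.dta:

 . sort id
 . merge id using demogs
 . * optional check (not in the original steps): count observations by match status
 . tabulate _merge
 . keep if _merge==3
 . drop _merge

If every observation shows _merge equal to 3, the keep drops nothing.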

That is all there is to dealing with this problem.

Why the problem arises, if you care

One of the steps performed internally by reshape is a match merge, using the reshape cons variables to match observations. While Stata allows you to sort on any number of variables, the maximum number of key variables in a match merge is 10. Hence, the limitation.

The reshape cons variables include those variables that (1) uniquely identify the subjects and (2) do not vary within subject but that you want carried along. To fix the problem on our end, we should change reshape so that list (1) is given with some new reshape id command, which would continue to be limited to 10 variables, and list (2) is given by reshape cons and is not limited.