Re: st: reshape

From	Daniel Feenberg <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: reshape
Date	Tue, 7 Aug 2012 15:19:59 -0400 (EDT)


On Tue, 7 Aug 2012, Airey, David C wrote:

When reshaping datasets from wide to long with very many variables and rows, is there any gain in speed of reshaping fewer variables or rows and then later combining versus letting reshape do its thing on the whole data set?


I don't know about that, but...

The reshape command is inexplicably slow. Take a wide dataset with variables id and x2001-x2010, where the year is encoded in the variable names. Then the command:


  reshape long x, i(id) j(year)

takes about 20 seconds per million observations. But you can write out a separate file for each year of data, and then concatenate them into one long dataset in about 2 seconds. For example:


  forvalues year = 2001/2010 {
    * load only id and this year's column (use takes clear, not replace)
    use id x`year' using widedata, clear
    rename x`year' x
    * the wide file has no year variable, so create it here
    generate int year = `year'
    save "/tmp/reshape`year'", replace
  }
  clear
  forvalues year = 2001/2010 {
    append using "/tmp/reshape`year'"
  }

Obviously, the additional code isn't worthwhile unless you have multi-millions of observations, or are reshaping many times, but sometimes that is what you have.
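
If you want to check the timing on your own machine, something like the following sketch may help. It builds a synthetic wide dataset, times reshape against the split-and-append approach with timer, and uses cf to confirm that both give the same long data. The number of observations, the random x values, and the tempfile names (wide, viareshape, part2001 and so on) are assumptions made up for illustration. It uses tempfile rather than /tmp so that it is not tied to one operating system.

```stata
* Synthetic wide data: id plus x2001-x2010 (sizes and values are assumptions)
clear
set obs 100000
generate long id = _n
forvalues year = 2001/2010 {
    generate double x`year' = runiform()
}
tempfile wide viareshape
save "`wide'"

* Method 1: reshape
timer clear
timer on 1
reshape long x, i(id) j(year)
timer off 1
sort id year
save "`viareshape'"

* Method 2: one file per year, then append
timer on 2
forvalues year = 2001/2010 {
    use id x`year' using "`wide'", clear
    rename x`year' x
    generate int year = `year'
    tempfile part`year'
    save "`part`year''"
}
clear
forvalues year = 2001/2010 {
    append using "`part`year''"
}
timer off 2

* Put the appended data in the same order as the reshape result
sort id year

* cf reports any variables whose values differ from the reshape result
cf _all using "`viareshape'"

* Timer 1 is reshape, timer 2 is split-and-append
timer list
```

Note that the appended data come out ordered by year rather than by id, so the sort is needed before comparing, and it is left outside the timed section here. Whether to count it depends on whether you need the result sorted.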

dan feenberg
NBER