



Re: st: Reshape vs. merge: computing time?


From   Daniel Feenberg <[email protected]>
To   [email protected]
Subject   Re: st: Reshape vs. merge: computing time?
Date   Sat, 18 Jan 2014 16:34:00 -0500 (EST)


On Sat, 18 Jan 2014, Dorothy Bridges wrote:

> Dear all: I have a data set in which 20 million individuals appear in
> each of ten periods (~200 million observations total). The data are
> long and I would like them to be wide. From a computing time / memory
> usage perspective, am I better off (a) running reshape or (b) saving
> each of the ten periods as a separate file and then merging on person
> ID?

-reshape- is not as fast as one might expect, and wide to long can be
done much faster by reading the file multiple times. See the link
under "Reshape" at

  http://www.nber.org/stata/efficiency
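
For readers who want the idea without following the link, here is a rough sketch of what "reading the file multiple times" could look like for wide to long. It is not necessarily the code at the link above. The file name wide.dta and the variables id and y1-y10 are hypothetical, and the sketch is meant to be run from a do-file.

```stata
* Sketch only: wide.dta, id, and y1-y10 are hypothetical names.
* Each pass reads just two columns of the wide file, so memory use
* stays near the size of one period rather than the whole data set.
forvalues t = 1/10 {
    use id y`t' using wide.dta, clear
    rename y`t' y
    generate byte period = `t'
    tempfile part`t'
    save `part`t''
}

* Stack the ten single-period pieces into one long file.
use `part1', clear
forvalues t = 2/10 {
    append using `part`t''
}
sort id period
```

The pieces are stacked with -append- rather than -merge-, so no key matching is needed at any step.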

But a similar trick to go long to wide won't work because -save- doesn't allow an -if- qualifier, meaning you would have to reread the whole file for each of the 10 saves. Also, -merge- is quite a bit slower than -use-.
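
To make the cost concrete, here is a rough sketch of option (b) from the question. The file name long.dta and the variables id, period, and y are hypothetical. -use- does accept an -if- qualifier, so the separate -save- per period can be skipped, but each pass still has to scan the full long file, and each pass after the first pays for a -merge-.

```stata
* Sketch only: long.dta, id, period, and y are hypothetical names.
* Assumes exactly one observation per id per period.
tempfile wide
forvalues t = 1/10 {
    * Full scan of long.dta on every pass; only period `t' is kept.
    use id period y if period == `t' using long.dta, clear
    drop period
    rename y y`t'
    if `t' > 1 {
        * Attach the periods accumulated so far.
        merge 1:1 id using `wide', nogenerate
    }
    save `wide', replace
}
sort id
```

That is ten scans of roughly 200 million observations plus nine merges, which is why this route is unlikely to beat the wide-to-long trick in reverse.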

Dan Feenberg

> Thank you,
> D
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


