Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
re:st: Speed with large panel datasets
From: Christopher Baum <[email protected]>
To: "[email protected]" <[email protected]>
Subject: re:st: Speed with large panel datasets
Date: Mon, 21 Mar 2011 16:48:43 -0400
Gordon said:
forval i=1/`npanel' {
    arima depvar indvar1 indvar2 ... if np==`i' & <some other condition>
}
This will run very slowly on a large panel dataset, regardless of the command executed, because it uses an if condition to pick out the obs. belonging to each panel. It would run faster with an in condition that references the range of observations in each panel, which requires that the data be sorted by panel so that each panel's obs. are contiguous. If it is a balanced panel, it is easy to compute those ranges mechanically; if it is not, it only takes a couple of commands to identify the start and end of each panel.

This may still not beat the reshape approach you're using, but it has been documented previously that if conditions on a large panel run much more slowly: on every pass through the loop, Stata has to check whether each of, e.g., a million obs. belongs to the current panel. You already know exactly which do and which don't, and can specify that as an in condition.
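A minimal sketch of the in-range approach for an unbalanced panel follows. It assumes the panel identifier is np as in Gordon's loop; timevar, obsno, and lastobs are hypothetical names introduced only for illustration, and the variable list stands in for whatever model is actually being fitted.

```stata
* Sketch only: timevar, obsno and lastobs are hypothetical names.
* Sort so that each panel occupies one contiguous block of observations.
sort np timevar

* Record, on every obs, the number of the last obs in its panel.
generate long obsno = _n
by np: generate long lastobs = obsno[_N]

* Walk through the data one panel at a time using in ranges.
local start = 1
while `start' <= _N {
    local end = lastobs[`start']
    arima depvar indvar1 indvar2 in `start'/`end'
    local start = `end' + 1
}
```

In a balanced panel with T obs. per panel, the bookkeeping variables are unnecessary: panel i runs from obs. (i-1)*T + 1 to i*T. Any further condition can still be given as an if qualifier alongside the in range.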
Kit
Kit Baum | Boston College Economics and DIW Berlin | http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html