
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



re:st: Speed with large panel datasets


From   Christopher Baum <[email protected]>
To   "[email protected]" <[email protected]>
Subject   re:st: Speed with large panel datasets
Date   Mon, 21 Mar 2011 16:48:43 -0400

Gordon said

forval i=1/`npanel' {
     arima depvar indvar1 indvar2 ...  if np==`i' & <some other condition>
}


This will run very slowly on a large panel dataset, regardless of the command executed, because it uses an if condition to pick out the observations belonging to each panel. It would run faster if you used an in condition that referenced the observations in each panel, with the data sorted so that each panel's observations are contiguous. If it is a balanced panel, it is easy to compute those ranges mechanically; if it is not, it only takes a couple of commands to identify the start and end of each panel. This may still not beat the reshape approach you're using. But it has been documented previously that if conditions on a large panel run much more slowly: on every pass through the loop, Stata has to check whether each of, e.g., a million observations belongs to the current panel. You already know exactly which observations do and which don't, and can specify that as an in condition.
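A rough sketch of what I have in mind, for the unbalanced case. The toy data and the names t, x, y, obsno, and last are made up for illustration, and regress stands in for arima so that the example runs on its own; only np comes from the loop quoted above.

```stata
* Toy unbalanced panel (hypothetical data): panels of 8, 12, and 10 obs
clear
set seed 2011
set obs 30
generate int np = cond(_n <= 8, 1, cond(_n <= 20, 2, 3))
bysort np: generate int t = _n
generate double x = rnormal()
generate double y = 1 + 2*x + rnormal()

* Sort so that each panel occupies one contiguous block of observations
sort np t

* Record, on every observation, the observation number that ends its panel
generate long obsno = _n
by np: generate long last = obsno[_N]

* Walk through the blocks: each panel runs from `start' to `end'
local start = 1
while `start' <= _N {
    local end = last[`start']
    * regress is a stand-in; substitute the command you actually need
    regress y x in `start'/`end'
    local start = `end' + 1
}
```

For a balanced panel with T observations per panel, the range for panel i is simply (i-1)*T+1 through i*T, so no helper variables are needed. Any additional condition can still be given as an if alongside the in range.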

Kit



Kit Baum   |   Boston College Economics and DIW Berlin   |   http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming   |   http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata   |   http://www.stata-press.com/books/imeus.html



