Thank you.
At 02:25 PM 2/20/2007, you wrote:
Fred Wolfe wrote:
Stata documents the increase in processing speed for estimation
commands, but only for a few non-estimation commands (generate,
replace). Does StataCorp have data on the performance of functions,
foreach, forvalues, merge, egen, etc.? We reserve one computer for
data management and the interface between SQL and Stata, and were
wondering how much of a performance increase we would see for data
management tasks. We are thinking of 64-bit Vista.
Thanks.
Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
Tel +1 316 263 2125
fwolfe@arthritis-research.org
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
We concentrated our efforts on estimation commands because they
benefited the most from parallel processing. For non-estimation
commands, we parallelized -generate-, -replace-, -by generate-, and
-by replace-. We did not parallelize commands like -foreach- and
-forvalues- because such loops can be inherently sequential, i.e., one
iteration may depend on the results of earlier iterations. Any attempt
to parallelize -foreach- and -forvalues- would require a new directive
in the do/ado code to mark whether a loop can be parallelized. (Our
design explicitly ruled out such markers. We imposed the restriction
that the do/ado code be identical for Stata/MP and Stata/SE.)
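To make the distinction concrete, here is a minimal sketch; the variable names y, a, and b are made up for illustration and are not taken from anyone's actual code. In the first loop each pass reads the value written by the previous pass, so the passes cannot be reordered or run at the same time. In the second loop the passes are independent, but nothing in the loop syntax says so, which is why a marker would be needed to parallelize across iterations.

```stata
* Illustrative sketch only; y, a, and b are hypothetical variables.
clear
set obs 5
generate double y = 1
generate double a = _n
generate double b = 10 * _n

* Sequential loop: observation i needs the value just written to
* observation i-1, so iteration i depends on iteration i-1.
forvalues i = 2/5 {
    quietly replace y = 2 * y[`i' - 1] in `i'
}

* Independent loop: each pass touches a different variable, but the
* loop syntax itself carries no information about that independence.
foreach v of varlist a b {
    quietly replace `v' = 2 * `v'
}
```

In both loops the individual -replace- calls are the kind of command described above as parallelized; it is only the iteration over the loop body that runs one pass at a time.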
Unfortunately, if your task spends most of its time moving data back and
forth between an SQL server and Stata, it might not benefit much from Stata/MP.
Some parts of -merge- and some -egen- commands can be parallelized.
But we feel the bottleneck of -merge- is disk I/O rather than
Stata's internal processing. To the extent that an -egen- command
uses -generate- and -replace-, it is already parallelized. For some
group-level -egen- commands, a higher degree of parallelization can
be obtained by using the equivalent -by group: generate...- commands.
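As a rough sketch of what such a rewrite can look like, assuming the auto dataset shipped with Stata and a group mean as the statistic of interest (both chosen only for illustration), the -egen- call below and the -by ...: generate- / -by ...: replace- pair that follows it produce the same variable. No timing claim is implied; whether the second form is actually faster would need to be measured on the data at hand.

```stata
* Illustrative sketch using the auto dataset shipped with Stata.
sysuse auto, clear

* Group mean via -egen-
egen double mean_egen = mean(price), by(foreign)

* The same group mean via -by ...: generate- and -by ...: replace-.
* sum() is a running sum within each by-group; the last observation
* of the group therefore holds the group total divided by the count
* of nonmissing values.
bysort foreign: generate double mean_by = sum(price) / sum(!missing(price))
by foreign: replace mean_by = mean_by[_N]

* The two results should agree up to rounding error
assert reldif(mean_egen, mean_by) < 1e-12
```

The second form uses only -by generate- and -by replace-, which are the commands named above as parallelized.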
--Hua
hpeng@stata.com