|Title||Running Stata in a clustered environment|
|Author||Kevin Crow, StataCorp|
|Date||November 2007; updated October 2014|
A computer cluster is a group of loosely coupled computers that work together so closely that, in many respects, they can be thought of as a single computer. The components of a cluster are commonly connected to each other through fast local-area networks. Clusters are most often used to provide greater performance and availability than a single computer can, and they are typically more cost-effective than single computers of comparable speed and availability.
A cluster of multiple computers is different from a single computer with multiple processors or cores.
StataCorp sells Stata/MP, which is a version of Stata providing extensive support for parallel computations on multiple-processor and multiple-core computers. StataCorp does not sell a version of Stata that supports parallel computations across multiple computers (cluster-processing capabilities).
Some sites have written custom scripts for running computationally intensive tasks, such as simulations, that allow running multiple copies of Stata simultaneously in a clustered environment. Typically, one master Stata task runs on one of the computers in the cluster, hands out batch jobs to other instances of Stata running on the other computers in the cluster, and collects the results.
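The master-and-worker pattern described above can be sketched in a few lines of Python. This is only an illustration, not a script supplied or supported by StataCorp. It assumes a Unix system where the Stata/MP console executable is named `stata-mp` and is on the path, a hypothetical do-file `sim.do` that reads a seed from its first argument and saves its own results, and that all jobs are launched from one machine. On a real cluster, each command would normally be wrapped in whatever submission command the site's job scheduler or remote shell provides.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def stata_batch_command(do_file, seed, stata_exe="stata-mp"):
    """Build a Unix batch-mode command line such as: stata-mp -b do sim.do 3.

    The executable name and the seed argument are assumptions for this sketch.
    """
    return [stata_exe, "-b", "do", do_file, str(seed)]


def run_job(command):
    """Run one batch job and return (command, exit code).

    The exit code is None if the executable could not be found.
    """
    try:
        completed = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        return command, None
    return command, completed.returncode


def dispatch(commands, max_workers=4):
    """Hand out jobs to a pool of workers and collect results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, commands))


if __name__ == "__main__":
    # Eight replications of the hypothetical sim.do, each with its own seed.
    jobs = [stata_batch_command("sim.do", seed) for seed in range(1, 9)]
    for command, code in dispatch(jobs):
        print(" ".join(command), "->", code)
```

Threads are sufficient here because the master only waits on external processes. Each job is responsible for writing its own output file, which the master would then combine in a final step.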
If each of the computers is a multiprocessor or multicore computer, Stata/MP may be used to perform parallel computations on each computer. More information about Stata/MP is available on the Stata website.
If more than one person will be using Stata simultaneously on a cluster, a network license for the number of simultaneous users is required.