__Using Stata/SE__

There are three flavors of Stata:

   Flavor            Description
   -----------------------------------------------
   **Stata/IC**      standard version
-> **Stata/SE**      Stata/IC + large datasets
   **Stata/MP**      Stata/SE + parallel processing
   -----------------------------------------------
See **[U] 5 Flavors of Stata** for descriptions

To determine which flavor of Stata you are running, type

**. about**

If you are using a different flavor of Stata, click on the appropriate
link:

-----------------------------------------------
**Stata/IC** Using Stata/IC
**Stata/MP** Using Stata/MP
-----------------------------------------------

For information on upgrading to Stata/SE or Stata/MP, point your browser
to http://www.stata.com.

__Contents__

1. Starting Stata/SE

2. Setting Stata/SE's limits
2.1 Advice on setting maxvar
2.2 Advice on setting matsize

3. Sharing .dta datasets with non-SE users

4. Querying memory usage

5. Advice to programmers
5.1 Determining flavor
5.2 Avoid macro shift in program loops

__1. Starting Stata/SE__

You start Stata/SE in much the same way as you start Stata/IC:

Windows:
Select **Start > All Programs > Stata 15.1 > StataSE 15.1**

Mac:
Double-click the file **Stata.do** from the **data** folder, or
double-click the **StataSE** icon from the **Stata** folder.

Unix:
At the Unix command prompt, type **xstata-se** to invoke the GUI
version of Stata/SE, or type **stata-se** to invoke the console
version.

__2. Setting Stata/SE's limits__

The two limits for Stata/SE are as follows:

1. **maxvar**
The maximum number of variables allowed in a dataset. This
limit is initially set to 5,000; you can increase it up to
32,767.

2. **matsize**
The maximum size of matrices or, said differently, the
maximum number of independent variables allowed in the
models that you fit. This limit is initially set to 400,
and you can increase it up to 11,000.

You reset the limits by using the

**set maxvar** *#* [**,** __perm__**anently**]
**set matsize** *#* [**,** __perm__**anently**]

commands. For instance, you might type

**. set maxvar 6000**
**. set matsize 900**

The order in which you set the limits does not matter. If you specify
the **permanently** option when you set a limit, in addition to making the
change for the present session, Stata/SE will remember the new limit and
use it in the future when you invoke Stata/SE:

**. set maxvar 6000, permanently**
**. set matsize 900, permanently**

You can reset the current or permanent limits whenever and as often as
you wish.
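
Before changing a limit, it can help to see what is currently in effect. The following is a minimal sketch, assuming the current values are exposed as the creturn results **c(maxvar)** and **c(matsize)**:

```stata
* show the limits currently in effect for this session
display "maxvar  = " c(maxvar)
display "matsize = " c(matsize)
```

The same values should also appear in the output of **query memory**, described in section 4.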

__2.1 Advice on setting maxvar__

**set maxvar** *#* [**,** __perm__**anently**] 2,048 <= *#* <= 32,767

Why is there a limit on **maxvar**? Why not just set **maxvar** to 32,767 and be
done with it? Because simply allowing room for variables, even if they
do not exist, consumes memory. If you use only datasets with far fewer
variables, that memory is wasted.

For instance, if you set **maxvar** to 20,000, you would consume
approximately 14 more megabytes than if you left **maxvar** at the default.
That's not a huge amount of memory, but there is no need to waste it.

**Recommendation**: Think about datasets with the most variables that
you typically use. Set **maxvar** to a few hundred or even 1,000 above
that. (The memory cost of an extra 1,000 variables is about 1 MB.)

**Remember**, you can always reset **maxvar** temporarily by typing **set**
**maxvar** *#*.
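
The recommendation above can be sketched as follows. This is an illustration only: **mydata.dta** is a hypothetical filename standing in for the widest dataset you typically use, and the headroom of 1,000 variables is simply the upper end of the range suggested above. The sketch uses **describe using** so that no data need be loaded, on the assumption that **set maxvar** may refuse to run while a dataset is in memory.

```stata
* mydata.dta is a hypothetical filename; substitute your widest dataset
quietly describe using mydata.dta
local k = r(k)

* add headroom of 1,000 variables, staying within the 2,048-32,767 range
local newmax = min(32767, max(2048, `k' + 1000))
display "variables in file: `k';  suggested maxvar: `newmax'"

* apply for this session only; add ", permanently" to remember the setting
set maxvar `newmax'
```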

__2.2 Advice on setting matsize__

**set matsize** *#* [**,** __perm__**anently**] 10 <= *#* <= 11,000

The name **matsize** is unfortunate because it suggests something that is
only partially true. It suggests that the maximum size of matrices is
**matsize** *x* **matsize**. **matsize**, however, is irrelevant for the size of
matrices in Mata, Stata's modern matrix-programming language. Regardless
of the value of **matsize**, Mata matrices may be larger or smaller than that.

**matsize** specifies the maximum size of matrices in Stata's old matrix
language -- and that is not of great importance -- and it specifies the
maximum number of variables that may appear in Stata's estimation
commands -- and that is important. A better name for **matsize** would be
**modelsize**.

With that introduction, let us begin.

Although **matsize** can theoretically be set up to 11,000, on all but the
largest 64-bit computers you will be unable to do that, and even if you
succeeded, Stata/SE would probably run out of memory. The value of
**matsize** has a dramatic effect on memory usage, the formula being

Number of megabytes = (8 * **matsize**^2 + 88 * **matsize**) / 1024^2

For instance,

+----------------------------+
| **matsize** |   Memory use |
|-------------+--------------|
|         400 |       1.254M |
|         800 |       4.950M |
|       1,600 |      19.666M |
|       3,200 |      78.394M |
|       6,400 |     313.037M |
|      11,000 |     924.080M |
+----------------------------+
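
To evaluate the formula above for other values of **matsize**, a short loop such as the following could be used. It is a sketch that only applies the formula as stated; the values in the list are the ones from the table, and the result should agree with the table to three decimal places.

```stata
* megabytes implied by the formula: one matsize x matsize matrix of
* 8-byte values plus 11 vectors of length matsize
foreach m of numlist 400 800 1600 3200 6400 11000 {
    display %8.0fc `m' "   " %9.3f (8*`m'^2 + 88*`m')/(1024^2) "M"
}
```

Replace the list with the values you are considering before typing **set matsize**.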

The formula, in fact, understates the amount of memory certain Stata
commands use and understates what you will use yourself if you use
Stata's old matrix language matrices directly. The formula gives the
amount of memory required for one matrix and 11 vectors. If two matrices
are required, the numbers above are nearly doubled. When you **set**
**matsize**, Stata will refuse if you specify too large a value, but remember
that even if Stata does not complain, you still may run into problems
later. Stata might be running some statistical command and then
complain, "op. sys. refuses to provide memory; r(909)".

For **matsize**=11,000, nearly 1 GB of memory is required, and doubling that
would require nearly 2 GB of memory. On most 32-bit computers, 2 GB is
the most memory that the operating system will allocate to one task, so
nearly nothing would be left for the rest of Stata.

Why, then, is **matsize** allowed to be set so large? Because on 64-bit
computers, such large amounts cause no difficulty.

For reasonable values of **matsize** (say, up to 3,200), memory consumption
is not too great. Choose a reasonable value given the kinds of models
you fit, and remember that you can always reset the value.

__3. Sharing .dta datasets with non-SE users__

You may share datasets with Stata/MP users with no changes necessary.
You may share datasets with Stata/IC users as long as your dataset does
not have more variables than are allowed in that flavor of Stata. See
help limits.
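
A quick check before sharing might look like the following sketch. It assumes the dataset is already in memory and that the Stata/IC limit is 2,048 variables; that figure is an assumption here and should be confirmed in help limits for the recipient's version.

```stata
* assumes the dataset to be shared is already in memory
quietly describe

* 2,048 is the assumed Stata/IC variable limit; confirm with help limits
if r(k) > 2048 {
    display as error "dataset has " r(k) " variables; too many for Stata/IC"
}
else {
    display "dataset has " r(k) " variables; within the assumed Stata/IC limit"
}
```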

__4. Querying memory usage__

The command

**. memory**

will display the current memory report and the command

**. query memory**

will display the current memory settings. See help memory.

__5. Advice to programmers__

__5.1 Determining flavor__

Programmers can determine which flavor of Stata is running by examining
the creturn values

            |          creturn values
            | **c(flavor)   c(SE)   c(MP)**
------------+------------------------------
Stata/IC    |    "**IC**"     0       0
Stata/SE    |    "**IC**"     1       0
Stata/MP    |    "**IC**"     1       1
-------------------------------------------
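
Because **c(flavor)** is the same in all three rows, a program has to look at **c(MP)** and **c(SE)** to tell the flavors apart. A minimal sketch of that test, using only the values in the table above:

```stata
* test c(MP) first, because c(SE) is also 1 under Stata/MP
if c(MP) {
    display "running Stata/MP"
}
else if c(SE) {
    display "running Stata/SE"
}
else {
    display "running Stata/IC"
}
```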

__5.2 Avoid macro shift in program loops__

**macro shift** has negative performance implications when used with variable
lists containing 20,000 or more variables. We recommend avoiding the use
of **macro shift** in loops and instead using either **foreach** or "double
indirection". Double indirection means referring to **``i''** when **`i'**
contains a number 1, 2, ....
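
The two alternatives can be sketched as follows. The program names **listvars** and **listargs** are hypothetical and exist only for illustration; each simply displays its arguments one per line.

```stata
* listvars and listargs are hypothetical program names

* alternative 1: foreach walks the list without shifting positional macros
capture program drop listvars
program define listvars
    syntax varlist
    foreach v of local varlist {
        display "`v'"
    }
end

* alternative 2: double indirection; local i holds 1, 2, ..., and the
* doubly quoted reference expands to the positional argument with that number
capture program drop listargs
program define listargs
    local i = 1
    while "``i''" != "" {
        display "``i''"
        local ++i
    }
end
```

For example, after loading a dataset you might type **listvars** followed by a variable list, or **listargs a b c**. Neither program calls **macro shift**, so the positional macros are never renumbered inside the loop.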