Stata 15 help for SpecialEdition

Using Stata/SE

There are three flavors of Stata:

    Flavor      Description
    -----------------------------------------------
    Stata/IC    standard version
 -> Stata/SE    Stata/IC + large datasets
    Stata/MP    Stata/SE + parallel processing
    -----------------------------------------------
    See [U] 5 Flavors of Stata for descriptions

To determine which flavor of Stata you are running, type

. about

If you are using a different flavor of Stata, click on the appropriate link:

    -----------------------------------------------
    Stata/IC    Using Stata/IC
    Stata/MP    Using Stata/MP
    -----------------------------------------------

For information on upgrading to Stata/SE or Stata/MP, point your browser to http://www.stata.com.

Contents

1. Starting Stata/SE

2. Setting Stata/SE's limits
    2.1 Advice on setting maxvar
    2.2 Advice on setting matsize

3. Sharing .dta datasets with non-SE users

4. Querying memory usage

5. Advice to programmers
    5.1 Determining flavor
    5.2 Avoid macro shift in program loops

1. Starting Stata/SE

You start Stata/SE in much the same way as you start Stata/IC:

Windows: Select Start > All Programs > Stata 15 > StataSE 15

Mac: Double-click the file Stata.do from the data folder, or double-click the StataSE icon from the Stata folder.

Unix: At the Unix command prompt, type xstata-se to invoke the GUI version of Stata/SE, or type stata-se to invoke the console version.

2. Setting Stata/SE's limits

The two limits for Stata/SE are as follows:

1. maxvar The maximum number of variables allowed in a dataset. This limit is initially set to 5,000; you can increase it up to 32,767.

2. matsize The maximum size of matrices, or said differently, the maximum number of independent variables allowed in the models that you fit. This limit is initially set to 400, and you can increase it up to 11,000.

You reset the limits by using the

    set maxvar  # [, permanently]
    set matsize # [, permanently]

commands. For instance, you might type

    . set maxvar 6000
    . set matsize 900

The order in which you set the limits does not matter. If you specify the permanently option when you set a limit, in addition to making the change for the present session, Stata/SE will remember the new limit and use it in the future when you invoke Stata/SE:

    . set maxvar 6000, permanently
    . set matsize 900, permanently

You can reset the current or permanent limits whenever and as often as you wish.
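To confirm which limits are in effect after a change, the current values can be read back. The lines below are a minimal sketch; they assume the c-class values c(maxvar) and c(matsize), which are not mentioned elsewhere in this help file.

```stata
* Read back the limits currently in effect (assumes c(maxvar) and c(matsize))
display "maxvar  = " c(maxvar)
display "matsize = " c(matsize)
```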

2.1 Advice on setting maxvar

    set maxvar # [, permanently]

    2,048 <= # <= 32,767

Why is there a limit on maxvar? Why not just set maxvar to 32,767 and be done with it? Because simply allowing room for variables, even if they do not exist, consumes memory, and if you will be using only datasets with a lot fewer variables, you will be wasting memory.

For instance, if you set maxvar to 20,000, you would consume approximately 14 more megabytes than if you left maxvar at the default. That's not a huge amount of memory, but there is no need to waste it.

Recommendation: Think about datasets with the most variables that you typically use. Set maxvar to a few hundred or even 1,000 above that. (The memory cost of an extra 1,000 variables is about 1 MB.)

Remember, you can always reset maxvar temporarily by typing set maxvar #.
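The recommendation above can be turned into a quick calculation. This is only a sketch under stated assumptions: the dataset with the most variables you typically use is already in memory, c(k) reports its number of variables, and the margin of 1,000 is taken from the upper end of the suggestion above. The local names `margin' and `target' are illustrative.

```stata
* Sketch: suggest a maxvar value from the dataset currently in memory
local margin = 1000                       // headroom above the variable count
local target = c(k) + `margin'            // c(k) = number of variables in memory
local target = max(2048, min(32767, `target'))   // keep within the allowed range
display "variables in memory: " c(k)
display "suggested maxvar:    `target'"
```

You could then type set maxvar followed by the suggested value.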

2.2 Advice on setting matsize

    set matsize # [, permanently]

    10 <= # <= 11,000

The name matsize is unfortunate because it suggests something that is only partially true. It suggests that the maximum size of matrices is matsize x matsize. matsize, however, is irrelevant for the size of matrices in Mata, Stata's modern matrix-programming language. Regardless of the value of matsize, Mata matrices may be larger or smaller than that.

matsize specifies the maximum size of matrices in Stata's old matrix language -- and that is not of great importance -- and it specifies the maximum number of variables that may appear in Stata's estimation commands -- and that is important. A better name for matsize would be modelsize.

With that introduction, let us begin.

Although matsize can theoretically be set up to 11,000, on all but the largest 64-bit computers you will be unable to do that, and even if you succeeded, Stata/SE would probably run out of memory. The value of matsize has a dramatic effect on memory usage, the formula being

Number of megabytes = (8*matsize^2 + 88*matsize)/(1024^2)

For instance,

    +--------------------------+
    |  matsize  |  Memory use  |
    |-----------+--------------|
    |      400  |      1.254M  |
    |      800  |      4.950M  |
    |    1,600  |     19.666M  |
    |    3,200  |     78.394M  |
    |    6,400  |    313.037M  |
    |   11,000  |    924.080M  |
    +--------------------------+
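The figures in the table follow directly from the formula. As a sketch, the loop below evaluates it for the same matsize values; the local names `m' and `mb' are illustrative only.

```stata
* Evaluate megabytes = (8*matsize^2 + 88*matsize)/(1024^2) for several values
foreach m of numlist 400 800 1600 3200 6400 11000 {
    local mb = (8*`m'^2 + 88*`m')/(1024^2)
    display %8.0fc `m' "   " %9.3f `mb' "M"
}
```

Substituting another value into the numlist shows the cost of a setting you are considering before you commit to it.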

The formula, in fact, understates the amount of memory certain Stata commands use and understates what you will use yourself if you use Stata's old matrix language matrices directly. The formula gives the amount of memory required for one matrix and 11 vectors. If two matrices are required, the numbers above are nearly doubled.

When you set matsize, Stata will refuse if you specify too large a value, but remember that even if Stata does not complain, you still may run into problems later. Stata might be running some statistical command and then complain, "op. sys. refuses to provide memory; r(909)".

For matsize=11,000, nearly 1 GB of memory is required, and doubling that would require nearly 2 GB of memory. On most 32-bit computers, 2 GB is the most memory that the operating system will allocate to one task, so nearly nothing would be left for the rest of Stata.

Why, then, is matsize allowed to be set so large? Because on 64-bit computers, such large amounts cause no difficulty.

For reasonable values of matsize (say, up to 3,200), memory consumption is not too great. Choose a reasonable value given the kinds of models you fit, and remember that you can always reset the value.

3. Sharing .dta datasets with non-SE users

You may share datasets with Stata/MP users with no changes necessary. You may share datasets with Stata/IC users as long as your dataset does not have more variables than are allowed in that flavor of Stata. See help limits.

4. Querying memory usage

The command

. memory

will display the current memory report and the command

. query memory

will display the current memory settings. See help memory.

5. Advice to programmers

5.1 Determining flavor

Programmers can determine which flavor of Stata is running by examining the creturn values

    creturn values |  c(flavor)    c(SE)    c(MP)
    ---------------+-----------------------------
    Stata/IC       |    "IC"         0        0
    Stata/SE       |    "IC"         1        0
    Stata/MP       |    "IC"         1        1
    ---------------------------------------------
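Because c(flavor) is "IC" in all three rows and c(SE) is 1 under both Stata/SE and Stata/MP, a program has to test c(MP) before c(SE). The fragment below is a sketch of that order of tests; the local name `flavor' is illustrative.

```stata
* c(SE) is 1 under both Stata/SE and Stata/MP, so test c(MP) first
if c(MP) {
    local flavor "MP"
}
else if c(SE) {
    local flavor "SE"
}
else {
    local flavor "IC"
}
display "Running Stata/`flavor'"
```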

5.2 Avoid macro shift in program loops

macro shift has negative performance implications when used with variable lists containing 20,000 or more variables. We recommend avoiding the use of macro shift in loops and instead using either foreach or "double indirection". Double indirection means referring to ``i'' when `i' contains a number 1, 2, ....
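The two alternatives can be sketched as follows. The program names listvars_foreach and listvars_indirect are hypothetical and serve only to illustrate the loop structure; neither program uses macro shift.

```stata
* Hypothetical program: loop over the variable list with foreach
capture program drop listvars_foreach
program define listvars_foreach
    version 15
    syntax varlist
    foreach v of local varlist {
        display "`v'"
    }
end

* Hypothetical program: loop with double indirection
capture program drop listvars_indirect
program define listvars_indirect
    version 15
    syntax varlist
    tokenize `varlist'              // puts the names in `1', `2', ...
    local i = 1
    while "``i''" != "" {           // ``i'' is the contents of macro number `i'
        display "``i''"
        local ++i
    }
end
```

With a dataset in memory, either program can be called with a variable list, for example listvars_foreach followed by the names of variables in that dataset.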


© Copyright 1996–2018 StataCorp LLC