Stata 15 help for mark

[P] mark -- Mark observations for inclusion


Create marker variable after syntax

marksample lmacname [, novarlist strok zeroweight noby]

Create marker variable

mark newmarkvar [if] [in] [weight] [, zeroweight noby]

Modify marker variable

markout markvar [varlist] [, strok sysmissok]

Find range containing selected observations

markin [if] [in] [, name(lclname) noby]

Modify marker variable based on survey-characteristic variables

svymarkout markvar

aweights, fweights, iweights, and pweights are allowed; see weight. varlist may contain time-series operators; see tsvarlist.


marksample, mark, and markout are for use in Stata programs. marksample and mark are alternatives; marksample uses the information left behind by syntax, and mark is seldom used. Both create a 0/1 to-use variable that records which observations are to be used in subsequent code. markout sets the to-use variable to 0 in observations where any variable in varlist contains a missing value; it is used to further restrict the observations.
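As an illustration of marksample followed by markout, here is a minimal sketch. The program name myratio and its denom() option are hypothetical and exist only for this example; the auto dataset shipped with Stata is assumed to be available.

```stata
capture program drop myratio
program define myratio
        version 15
        // denom() is a hypothetical required option naming a numeric variable
        syntax varname(numeric) [if] [in] , DENom(varname numeric)

        // honor if, in, and missing values in the varlist
        marksample touse

        // also exclude observations where the denominator is missing
        markout `touse' `denom'

        tempvar ratio
        quietly generate double `ratio' = `varlist' / `denom' if `touse'
        summarize `ratio' if `touse'
end

sysuse auto, clear
myratio price if foreign == 0, denom(rep78)
```

Because denom() is an option rather than part of varlist, marksample does not know about it, which is why markout is called separately.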

markin is for use after marksample, mark, and markout. It sometimes provides a more efficient encoding of the observations to be used in subsequent code. markin is rarely used.

svymarkout sets the to-use variable to 0 wherever any of the survey-characteristic variables contain missing values; it is discussed in [SVY] svymarkout and is not further discussed here.


novarlist is for use with marksample. It specifies that missing values among variables in varlist not cause the marker variable to be set to 0. Specify novarlist if you previously specified

        syntax newvarlist ...

or

        syntax newvarname ...

You should also specify novarlist when missing values are not to cause observations to be excluded (perhaps you are analyzing the pattern of missing values).
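A minimal sketch of the newvarname case follows. The program name flagsample and the variable name insample are hypothetical. Here varlist holds the name of a variable that does not exist yet, so marksample is told not to inspect it for missing values.

```stata
capture program drop flagsample
program define flagsample
        version 15
        syntax newvarname [if] [in]

        // varlist names a variable still to be created, so skip the
        // missing-value check on it
        marksample touse, novarlist

        // store the marker under the user-supplied name
        generate `typlist' `varlist' = `touse'
end

sysuse auto, clear
flagsample insample if foreign == 1
count if insample == 1
```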

strok is used with marksample or markout. Specify this option if string variables in varlist are to be allowed. strok changes rule 6 in Remarks below to read

"The marker variable is set to 0 in observations for which any of the string variables in varlist contain ""."
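The following sketch shows strok with a varlist that mixes string and numeric variables. The program name cntcomplete is hypothetical; the auto dataset is assumed, in which make is a string variable.

```stata
capture program drop cntcomplete
program define cntcomplete
        version 15
        syntax varlist [if] [in]

        // allow string variables; an observation is dropped if a string
        // variable is "" or a numeric variable is missing
        marksample touse, strok

        quietly count if `touse'
        display as text "complete observations: " as result r(N)
end

sysuse auto, clear
cntcomplete make rep78
```

Without strok, the presence of make in varlist would set the marker to 0 in every observation.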

zeroweight is for use with marksample or mark. It deletes rule 1 in Remarks below, meaning that observations will not be excluded because the weight is zero.

noby is used rarely and only in byable(recall) programs. It specifies that, in identifying the sample, the restriction to the by-group be ignored. mark and marksample then create the marker variable as they would have had the user not specified the by prefix. If the user did not specify the by prefix, specifying noby has no effect. noby provides a way for byable(recall) programs to identify the overall sample. For instance, if the program needed to calculate the fraction of observations in the by-group, it would need to know both the sample to be used on this call and the overall sample. The program might be coded as

        program ..., byable(recall)
                ...
                marksample touse
                marksample alluse, noby
                ...
                quietly count if `touse'
                local curN = r(N)
                quietly count if `alluse'
                local totN = r(N)
                local frac = `curN'/`totN'
                ...
        end

See [P] byable.

sysmissok is used with markout. Specify this option if numeric variables in varlist equal to system missing (.) are to be allowed and only numeric variables equal to extended missing (.a, .b, ...) are to be excluded. The default is that all missing values (., .a, .b, ...) are excluded.
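A small sketch of sysmissok on made-up data follows. The variable x and its four values are invented for illustration. Going by the option description above, only the observation holding .a should end up excluded.

```stata
clear
input x
1
.
.a
4
end

// start with every observation marked
mark touse

// keep system missing (.), exclude extended missing (.a, .b, ...)
markout touse x, sysmissok

list x touse
count if touse
```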

name(lclname) is for use with markin. It specifies the name of the local macro to be created. If name() is not specified, the macro is named in.


Regardless of whether you use mark or marksample, followed or not by markout, the following rules apply:

1. The marker variable is set to 0 in observations for which weight is 0 (but see option zeroweight).

2. An appropriate error message is issued and execution stops if weight is invalid (such as being less than 0 in some observation or being a noninteger for frequency weights).

3. The marker variable is set to 0 in observations for which the if exp is not satisfied.

4. The marker variable is set to 0 in observations outside the in range.

5. The marker variable is set to 0 in observations for which any of the numeric variables in varlist contain a numeric missing value.

6. The marker variable is set to 0 in all observations if any of the variables in varlist are strings; see option strok for an exception.

7. The marker variable is set to 1 in the remaining observations.

Using the name touse is a convention, not a rule, but it is recommended for consistency between programs.
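The rules can be traced with mark and markout on the auto dataset (assumed available). This is only a sketch; a permanent variable named touse is used here so that the result can be inspected, whereas a program would normally use a temporary variable.

```stata
sysuse auto, clear

// rules 3 and 4: 0 where the if exp fails or outside the in range
mark touse if foreign == 0 in 1/60

// rule 5: additionally 0 where rep78 is missing
markout touse rep78

// rule 7: the remaining observations are 1
tabulate touse
count if touse
```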


        program ...
                syntax ...
                marksample touse
                ...
                ... if `touse' ...
                ...
        end
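One way the outline might be filled in is sketched here. The program name mysumm is hypothetical; it summarizes only observations that are complete on every variable in varlist, and the auto dataset is assumed for the usage line.

```stata
capture program drop mysumm
program define mysumm
        version 15
        syntax varlist(numeric) [if] [in] [aweight fweight]

        // one marker for if, in, weights, and missing values
        marksample touse

        // stop with "no observations" if nothing is left
        quietly count if `touse'
        if r(N) == 0 {
                error 2000
        }

        // every later command is restricted to the marked sample
        summarize `varlist' [`weight'`exp'] if `touse'
end

sysuse auto, clear
mysumm price rep78 if foreign == 0
```

Restricting every command with if `touse' keeps all variables on the same set of observations, which summarize alone would not do when variables have different patterns of missing values.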

© Copyright 1996–2018 StataCorp LLC