help recode
-------------------------------------------------------------------------------
Title
[D] recode -- Recode categorical variables
Syntax
Basic syntax
recode varlist (rule) [(rule) ...] [, generate(newvar)]
Full syntax
recode varlist (erule) [(erule) ...] [if] [in] [, options]
where the most common forms for rule are
+----------------------------------------------------------+
| rule           | Example     | Meaning                   |
|----------------+-------------+---------------------------|
| # = #          | 3 = 1       | 3 recoded to 1            |
| # # = #        | 2 . = 9     | 2 and . recoded to 9      |
| #/# = #        | 1/5 = 4     | 1 through 5 recoded to 4  |
| nonmissing = # | nonmiss = 8 | all other nonmissing to 8 |
| missing = #    | miss = 9    | all other missings to 9   |
+----------------------------------------------------------+
where erule has the form
    element [element ...] = el ["label"]
    nonmissing            = el ["label"]
    missing               = el ["label"]
    else | *              = el ["label"]
element has the form
    el | el/el
and el is
    # | min | max
The keyword rules missing, nonmissing, and else must be the last rules
specified. else may not be combined with missing or nonmissing.
options             Description
-------------------------------------------------------------------------
Options
  generate(newvar)  generate newvar containing transformed variables;
                      default is to replace existing variables
  prefix(str)       generate new variables with str prefix
  label(name)       specify a name for the value label defined by the
                      transformation rules
  copyrest          copy out-of-sample values from original variables
  test              test that rules are invoked and do not overlap
-------------------------------------------------------------------------
Menu
Data > Create or change data > Other variable-transformation commands >
Recode categorical variable
Description
recode changes the values of numeric variables according to the rules
specified. Values that do not match any rule are left unchanged unless
a keyword rule (missing, nonmissing, or else) is specified.
A range #1/#2 refers to all (real and integer) values between #1 and #2,
including the boundaries #1 and #2. This interpretation of #1/#2 differs
from that in numlists.
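As a minimal sketch of the inclusive, real-valued range, the following
uses a small made-up dataset (the variable names x and low are assumptions
for illustration only). Because 1/2 covers every value from 1 through 2,
the noninteger 1.5 should be recoded along with 1 and 2, whereas the
numlist 1/2 would expand only to the integers 1 and 2.
. clear
. input x
1
1.5
2
3
end
. recode x (1/2 = 1) (else = 0), generate(low)
. list x low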
min and max provide a convenient way to refer to the minimum and maximum
for each variable in varlist and may be used in both the from-value and
the to-value parts of the specification. Combined with if and in, the
minimum and maximum are determined over the restricted dataset.
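A hedged illustration of min and max, assuming the auto dataset shipped
with Stata as a stand-in (it is not part of the examples in this help
file; mpg2 and extreme are hypothetical new variable names). The first
recode splits mpg at 19/20 without hard-coding the endpoints. The second
marks the lowest and highest mpg among foreign cars only, because the if
qualifier restricts the sample over which min and max are computed.
. sysuse auto, clear
. recode mpg (min/19 = 1) (20/max = 2), generate(mpg2)
. recode mpg (min = 1) (max = 2) (else = 0) if foreign == 1, generate(extreme)
. tabulate extreme foreign, missing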
The keyword rules specify transformations for values not changed by the
previous rules:
nonmissing all nonmissing values not changed by the rules
missing all missing values (., .a, .b,..., .z) not changed
by the rules
else all nonmissing and missing values not changed by the
rules
* synonym for else
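The keyword rules can be sketched as follows, again assuming the auto
dataset as a stand-in (rep3 and low are hypothetical new variable names).
In the first command, 3, 4, and 5 are caught by nonmissing and missing
values of rep78 are caught by missing. In the second, else catches both
groups at once.
. sysuse auto, clear
. recode rep78 (1 2 = 1) (nonmissing = 2) (missing = 9), generate(rep3)
. tabulate rep78 rep3, missing
. recode rep78 (1 2 = 1) (else = 0), generate(low)
. tabulate rep78 low, missing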
recode provides a convenient way to define value labels for the generated
variables during the definition of the transformation, reducing the risk
of inconsistencies between the definition and value labeling of
variables. Value labels may be defined for integer values and for the
extended missing values (.a, .b,..., .z), but not for noninteger values
or for the system missing value (.).
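A sketch of labeling during recoding, assuming the auto dataset as a
stand-in (rep3 and the label texts are made up for illustration). Missing
values of rep78 are mapped to the extended missing value .a so that they
can carry a label. Because label() is not specified and only one variable
is recoded, the value label is expected to be named after the new
variable.
. sysuse auto, clear
. recode rep78 (1 2 = 1 "Poor") ///
(3 = 2 "Fair") ///
(4 5 = 3 "Good") ///
(missing = .a "Not rated") ///
, generate(rep3)
. label list rep3
. tabulate rep3, missing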
Although this is not shown in the syntax diagram, the parentheses around
the rules and keyword clauses are optional if you transform only one
variable and if you do not define value labels.
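For instance, under the assumption that the auto dataset is loaded
(rep2a and rep2b are hypothetical names), the following two commands are
intended to be equivalent; the second omits the parentheses, which is
permitted here because a single variable is recoded and no value labels
are defined.
. sysuse auto, clear
. recode rep78 (1/2 = 1) (3/5 = 2), generate(rep2a)
. recode rep78 1/2=1 3/5=2, generate(rep2b)
. assert rep2a == rep2b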
Options
+---------+
----+ Options +----------------------------------------------------------
generate(newvar) specifies the names of the variables that will contain
the transformed values. into() is a synonym for generate().
Values outside the range implied by if or in are set to missing (.),
unless the copyrest option is specified.
If generate() is not specified, the input variables are overwritten;
values outside the if or in range are not modified. Overwriting
variables is dangerous (you cannot undo changes, value labels may be
wrong, etc.), so we strongly recommend specifying generate().
prefix(str) specifies that the recoded variables be returned in new
variables formed by prefixing the names of the original variables
with str.
label(name) specifies a name for the value label defined from the
transformation rules. label() may be specified only with generate()
(or its synonym, into()) or prefix(). If one variable is recoded,
the label name defaults to newvar unless a label with that name
already exists.
copyrest specifies that out-of-sample values be copied from the original
variables. In line with other data-management commands, recode
defaults to setting newvar to missing (.) outside the observations
selected by if exp and in range.
test specifies that Stata test whether each rule is ever invoked and
whether any rules overlap; for example, the rules (1/5=1) (3=2) overlap.
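The interaction of if, generate(), copyrest, and test can be sketched as
follows, assuming the auto dataset as a stand-in (r_miss, r_copy, and
r_test are hypothetical names). Only foreign cars are recoded: without
copyrest, the out-of-sample (domestic) observations are set to missing in
the new variable; with copyrest, they keep their original rep78 values.
The last command deliberately uses overlapping rules so that test has
something to report.
. sysuse auto, clear
. recode rep78 (1 2 = 1) (3/5 = 2) if foreign == 1, generate(r_miss)
. recode rep78 (1 2 = 1) (3/5 = 2) if foreign == 1, generate(r_copy) copyrest
. list foreign rep78 r_miss r_copy in 1/5
. recode rep78 (1/5 = 1) (3 = 2), generate(r_test) test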
Examples
---------------------------------------------------------------------------
Setup
. webuse recxmpl
List the data
. list
For x, change 1 to 2, leave all other values unchanged, and store the
results in nx
. recode x (1 = 2), gen(nx)
List the result
. list x nx
For x1, swap 1 and 2, and store the results in nx1
. recode x1 (1 = 2) (2 = 1), gen(nx1)
List the result
. list x1 nx1
For x2, collapse 1 and 2 into 1, change 3 to 2, change 4 through 7 to 3,
and store the results in nx2
. recode x2 (1 2 = 1) (3 = 2) (4/7 = 3), gen(nx2)
List the result
. list x2 nx2
For x1, x2, and x3, change the direction of 1, 2, ..., 8, moving 8 to 1,
7 to 2, etc., and store the transformed variables in newx1, newx2, and
newx3
. recode x1-x3 (1=8) (2=7) (3=6) (4=5) (5=4) (6=3) (7=2) (8=1),
pre(new) test
List the result
. list x1 newx1 x2 newx2 x3 newx3
---------------------------------------------------------------------------
Setup
. webuse fullauto, clear
For rep77 and rep78, collapse 1 and 2 into 1, change 3 to 2, collapse 4
and 5 into 3, store results in newrep77 and newrep78, and define a new
value label newrep
. recode rep77 rep78 (1 2 = 1 "Below average") (3 = 2 Average)
(4 5 = 3 "Above average"), pre(new) label(newrep)
List the old and new value label
. label list repair newrep
List some of the data
. list *rep77 *rep78 in 1/10, nolabel
---------------------------------------------------------------------------
Tip: In a do-file, long recode commands may conveniently be written using
the line-continuation marker ///. For example:
. recode x y (1 2 = 1 low) ///
(3 = 2 medium) ///
(4 5 = 3 high) ///
(nonmissing = 9 "something else") ///
(missing = .) ///
, gen(Rx Ry) label(Cat3)
Also see
Manual: [D] recode
Help: [D] generate, [D] mvencode