help destring, help tostring
-------------------------------------------------------------------------------
Title
[D] destring -- Convert string variables to numeric variables and vice
versa
Syntax
Convert string variables to numeric variables
destring [varlist] , {generate(newvarlist)|replace} [destring_options]
Convert numeric variables to string variables
tostring varlist , {generate(newvarlist)|replace} [tostring_options]
destring_options        description
-------------------------------------------------------------------------
* generate(newvarlist)  generate newvar_1, ..., newvar_k for each
                          variable in varlist
* replace               replace string variables in varlist with numeric
                          variables
  ignore("chars")       remove specified nonnumeric characters
  force                 convert nonnumeric strings to missing values
  float                 generate numeric variables as type float
  percent               convert percent variables to fractional form
  dpcomma               convert variables with commas as decimals to
                          period-decimal format
-------------------------------------------------------------------------
* Either generate(newvarlist) or replace is required.
tostring_options        description
-------------------------------------------------------------------------
* generate(newvarlist)  generate newvar_1, ..., newvar_k for each
                          variable in varlist
* replace               replace numeric variables in varlist with string
                          variables
  force                 force conversion ignoring information loss
  format(format)        convert using specified format
  usedisplayformat      convert using display format
-------------------------------------------------------------------------
* Either generate(newvarlist) or replace is required.
Menu
destring
Data > Create or change data > Other variable-transformation commands
> Convert variables from string to numeric
tostring
Data > Create or change data > Other variable-transformation commands
> Convert variables from numeric to string
Description
destring converts variables in varlist from string to numeric. If
varlist is not specified, destring will attempt to convert all variables
in the dataset from string to numeric. Characters listed in ignore() are
removed. Variables in varlist that are already numeric will not be
changed. destring treats both empty strings "" and "." as indicating the
system missing value (.) and interprets the strings ".a", ".b", ..., ".z" as the
extended missing values .a, .b, ..., .z; see [U] 12.2.1 Missing values.
destring also ignores any leading or trailing spaces so that, for
example, " " is equivalent to "" and " . " is equivalent to ".".
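The missing-value and whitespace rules above can be tried on a small made-up dataset. This is only a sketch: the variable names s and n and the four values are invented for illustration and are not part of any shipped dataset.

```stata
* Hypothetical data: s and n are illustrative names only
clear
input str4 s
""
"."
".a"
" 12 "
end

* Create a numeric copy of s and leave the original string untouched
destring s, generate(n)

* Per the description above, n should hold ., ., .a, and 12
list
```

Using generate() rather than replace keeps the string variable available for comparison with the converted values.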
tostring converts variables in varlist from numeric to string. The most
compact string format possible is used. Variables in varlist that are
already string will not be converted.
Options for destring
Either generate() or replace must be specified. With either option, if
any string variable contains nonnumeric characters not specified with
ignore(), then no corresponding variable will be generated, nor will that
variable be replaced (unless force is specified).
generate(newvarlist) specifies that a new variable be created for each
variable in varlist. newvarlist must contain the same number of new
variable names as there are variables in varlist. If varlist is not
specified, destring attempts to generate a numeric variable for each
variable in the dataset; newvarlist must then contain the same number
of new variable names as there are variables in the dataset. Any
variable labels or characteristics will be copied to the new
variables created.
replace specifies that the variables in varlist be converted to numeric
variables. If varlist is not specified, destring attempts to convert
all variables from string to numeric. Any variable labels or
characteristics will be retained.
ignore("chars") specifies nonnumeric characters to be removed. If any
string variable contains any nonnumeric characters other than those
specified with ignore(), no action will take place for that variable
unless force is also specified. Note that to Stata the comma is a
nonnumeric character; see also the dpcomma option below.
force specifies that any string values containing nonnumeric characters,
in addition to any specified with ignore(), be treated as indicating
missing numeric values.
float specifies that any new numeric variables be created initially as
type float. The default is type double; see [D] data types.
destring attempts automatically to compress each new numeric variable
after creation.
percent removes any percent signs found in the values of a variable, and
all values of that variable are divided by 100 to convert the values
to fractional form. percent by itself implies that the percent sign,
"%", is an argument to ignore(), but the converse is not true.
dpcomma specifies that variables that use a comma as the decimal separator
be converted to use a period as the decimal separator.
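As a rough illustration of how these options behave, the sketch below applies ignore(), percent, dpcomma, and force to one variable each. The dataset and all variable names (sales, rate, amount, score, and their _n counterparts) are assumptions made up for this example.

```stata
* Hypothetical data: every variable name and value here is illustrative
clear
input str8 sales str6 rate str6 amount str6 score
"1,234" "12.5%" "3,14" "87"
"99"    "4%"    "2,5"  "n/a"
end

* Strip the thousands separator before conversion
destring sales, generate(sales_n) ignore(",")

* Drop the percent sign and divide by 100
destring rate, generate(rate_n) percent

* Treat the comma as the decimal separator
destring amount, generate(amount_n) dpcomma

* Convert what can be converted; "n/a" becomes missing
destring score, generate(score_n) force

list
```

Without force, the last command would leave score unconverted because "n/a" contains nonnumeric characters that are not listed in ignore().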
Options for tostring
Either generate() or replace must be specified. If converting any
numeric variable to string would result in loss of information, no
variable will be produced unless force is specified. For more details,
see force below.
generate(newvarlist) specifies that a new variable be created for each
variable in varlist. newvarlist must contain the same number of new
variable names as there are variables in varlist. Any variable
labels or characteristics will be copied to the new variables
created.
replace specifies that the variables in varlist be converted to string
variables. Any variable labels or characteristics will be retained.
force specifies that conversions be forced even if they entail loss of
information. Loss of information means one of two circumstances: 1)
The result of real(string(varname, "format")) is not equal to
varname, i.e., the conversion is not reversible without loss of
information; 2) replace was specified, but a variable has associated
value labels. In circumstance 1, it is usually best to specify
usedisplayformat or format(). In circumstance 2, value labels will
be ignored in a forced conversion. decode (see [D] encode) is the
standard way to generate a string variable based on value labels.
format(format) specifies that a numeric format be used as an argument to
the string() function, which controls the conversion of the numeric
variable to string. For example, a format of %7.2f specifies that
numbers are to be rounded to two decimal places before conversion to
string. See [D] functions and [D] format. format() cannot be
specified with usedisplayformat.
usedisplayformat specifies that the current display format be used for
each individual variable. For example, this option could be useful
when using U.S. Social Security numbers. usedisplayformat cannot be
specified with format().
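The following sketch shows usedisplayformat and format() side by side. The data and the variable names (ssn, x, ssn_s, x_s) are hypothetical; the %09.0f display format is assumed here to show how leading zeros can be carried into the string.

```stata
* Hypothetical data: ssn and x are illustrative names only
clear
input long ssn double x
12345678  3.14159
987654321 2.5
end

* Display ssn as nine digits with leading zeros
format ssn %09.0f

* Carry the display format, including leading zeros, into the string
tostring ssn, generate(ssn_s) usedisplayformat

* Rounding to two decimals loses information, so force is specified
tostring x, generate(x_s) format(%7.2f) force

list
```

The two options are used in separate commands because format() and usedisplayformat cannot be combined.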
Examples
---------------------------------------------------------------------------
Setup
. webuse destring1
. describe
. list
Generate numeric variables from the string variables
. destring, generate(id2 num2 code2 total2 income2)
Describe the result
. describe
List the result
. list
---------------------------------------------------------------------------
Setup
. webuse destring1, clear
. describe
. list
Convert string variables to numeric variables, replacing the original
string variables
. destring, replace
Describe the result
. describe
List the result
. list
---------------------------------------------------------------------------
Setup
. webuse destring2, clear
. describe date
. list date
Remove the spaces in date and convert it to a numeric variable, replacing
the original string variable
. destring date, ignore(" ") replace
Describe the result
. describe
List the result
. list
---------------------------------------------------------------------------
Setup
. webuse tostring, clear
. describe
. list
Convert the numeric variables year and day to string variables, replacing
the original numeric variables
. tostring year day, replace
Describe the result
. describe
List the result
. list
---------------------------------------------------------------------------
Also see
Manual: [D] destring
Help: [D] egen, [D] encode, [D] functions, [D] generate, [D] split