help string functions
-------------------------------------------------------------------------------
Title
[D] functions -- Functions
Description
This is a quick reference for string functions. For help on all
functions, see [D] functions.
String functions
abbrev(s,n)
Domain s: strings
Domain n: 5 to 32
Range: strings
Description: returns s, abbreviated to n characters.
If any character of s is a period, ".", and n < 8,
then n defaults to 8.
Otherwise, if n < 5, then n defaults to 5.
If n is missing, abbrev() will return the entire string
s. abbrev() is typically used with variable names and
variable names with time-series operators (the period
case). abbrev("displacement",8) is displa~t.
char(n)
Domain: integers 1 to 255
Range: ASCII characters
Description: returns the character corresponding to ASCII code n.
returns "" if n is not in the domain.
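As a brief illustration, the following sketch can be typed at the Stata prompt. It assumes only the standard ASCII table (65 is "A", 97 is "a"); the out-of-domain value 300 is an arbitrary choice for the example.

```stata
* 65 is the ASCII code for uppercase A
display char(65)
* the result is an ordinary string and can be concatenated
display "code 97 is " + char(97)
* 300 is outside the domain, so char() returns "" and the length is 0
display length(char(300))
```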
indexnot(s1,s2)
Domain s1: strings (to be searched)
Domain s2: strings of individual characters (to search for)
Range: integers 0 to 244
Description: returns the position in s1 of the first character of s1
not found in s2, or 0 if all characters of s1 are
found in s2.
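A typical use is to check whether a string contains anything other than an allowed set of characters. The sketch below is illustrative only; the strings and the digit set are made up for the example.

```stata
* the first character that is not a digit is the hyphen in position 5
display indexnot("2024-05", "0123456789")
* every character of s1 appears in s2, so the result is 0
display indexnot("2024", "0123456789")
```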
itrim(s)
Domain: strings
Range: strings with no multiple, consecutive internal blanks
Description: returns s with multiple, consecutive internal blanks
collapsed to one blank.
itrim("hello     there") = "hello there"
length(s)
Domain: strings
Range: integers 0 to 244
Description: returns the length of s.
length("ab") = 2
lower(s)
Domain: strings
Range: strings with lowercased characters
Description: returns the lowercased variant of s.
lower("THIS") = "this"
ltrim(s)
Domain: strings
Range: strings without leading blanks
Description: returns s without leading blanks.
ltrim(" this") = "this"
plural(n,s) or plural(n,s1,s2)
Domain n: real numbers
Domain s: strings
Domain s1: strings
Domain s2: strings
Range: strings
Description: returns the plural of s, or s1 in the 3-argument case,
if n != +/-1. The plural is formed by adding "s" to
s if you called plural(n,s). If you called
plural(n,s1,s2) and s2 begins with the character
"+", the plural is formed by adding the remainder of
s2 to s1. If s2 begins with the character "-", the
plural is formed by subtracting the remainder of s2
from s1. If s2 begins with neither "+" nor "-",
then the plural is formed by returning s2.
returns s, or s1 in the 3-argument case, if n = +/-1.
plural(1, "horse") = "horse"
plural(2, "horse") = "horses"
plural(2, "glass", "+es") = "glasses"
plural(1, "mouse", "mice") = "mouse"
plural(2, "mouse", "mice") = "mice"
plural(2, "abcdefg", "-efg") = "abcd"
proper(s)
Domain: strings
Range: strings
Description: returns s with the first letter capitalized and with
any other letter that immediately follows a nonletter
character capitalized; all other letters are
converted to lowercase.
proper("mR. joHn a. sMitH") = "Mr. John A. Smith"
proper("jack o'reilly") = "Jack O'Reilly"
proper("2-cent's worth") = "2-Cent'S Worth"
real(s)
Domain: strings
Range: -8e+307 to 8e+307 and missing
Description: returns s converted to numeric, or returns missing.
real("5.2")+1 = 6.2
real("hello") = .
regexm(s,re)
Domain s: strings
Domain re: regular expression
Range: 0 or 1
Description: matches the string s against the regular expression re;
evaluates to 1 if re is satisfied by s and to 0
otherwise. The regular expression syntax is based on
Henry Spencer's NFA algorithm and is nearly identical
to the POSIX.2 standard.
regexr(s1,re,s2)
Domain s1: strings
Domain re: regular expression
Domain s2: strings
Range: strings
Description: replaces the first substring within s1 that matches re
with s2 and returns the resulting string. If s1
contains no substring that matches re, the unaltered
s1 is returned.
regexs(n)
Domain: 0 to 9
Range: strings
Description: returns subexpression n from a previous regexm() match,
where 0 <= n < 10. Subexpression 0 is reserved for
the entire string that satisfied the regular
expression.
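The three regular expression functions are usually combined. The following is a minimal sketch, assuming a made-up variable named phone; the variable names and patterns are hypothetical and chosen only for illustration. It uses the common idiom of calling regexm() in the if qualifier so that regexs() can read the subexpressions of that match.

```stata
clear
input str20 phone
"555-1234"
"no number"
end
* 1 if phone contains digits, a hyphen, and more digits; 0 otherwise
generate byte hasnum = regexm(phone, "([0-9]+)-([0-9]+)")
* regexs(1) is the first parenthesized subexpression of the match
generate str8 prefix = regexs(1) if regexm(phone, "([0-9]+)-([0-9]+)")
* regexr() replaces only the first substring that matches
generate str20 masked = regexr(phone, "[0-9]+", "XXX")
list
```

In the first observation, prefix should hold "555" and masked should hold "XXX-1234"; the second observation has no match, so prefix is left empty and masked is the unaltered string.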
reverse(s)
Domain: strings
Range: reversed strings
Description: returns s reversed.
reverse("hello") = "olleh"
rtrim(s)
Domain: strings
Range: strings without trailing blanks
Description: returns s without trailing blanks: rtrim("this ") =
"this".
soundex(s)
Domain: strings
Range: strings
Description: returns the soundex code for a string, s. The soundex
code consists of a letter followed by three numbers:
the letter is the first letter of the name and the
numbers encode the remaining consonants. Similar
sounding consonants are encoded by the same number.
soundex("Ashcraft") = "A226"
soundex("Robert") = "R163"
soundex("Rupert") = "R163"
soundex_nara(s)
Domain: strings
Range: strings
Description: returns the U.S. Census soundex code for a string, s.
The soundex code consists of a letter followed by
three numbers: the letter is the first letter of the
name and the numbers encode the remaining
consonants. Similar sounding consonants are encoded
by the same number.
soundex_nara("Ashcraft") = "A261"
string(n)
Domain: -8e+307 to 8e+307 and missing
Range: strings
Description: returns n converted to a string:
string(4)+"F" = "4F"
string(1234567) = "1234567"
string(12345678) = "1.23e+07"
string(.) = "."
string(n,s)
Domain n: -8e+307 to 8e+307 and missing
Domain s: strings containing %fmt numeric display format
Range: strings
Description: returns n converted to a string using the numeric
display format s:
string(4,"%9.2f") = "4.00"
string(123456789,"%11.0g") = "123456789"
string(123456789,"%13.0gc") = "123,456,789"
string(0,"%td") = "01jan1960"
string(225,"%tq") = "2016q2"
string(225,"not a format") = ""
strlen(s) is a synonym for length(s).
strlower(x) is a synonym for lower(x).
strltrim(x) is a synonym for ltrim(x).
strmatch(s1,s2)
Domain: strings
Range: 0 or 1
Description: returns 1 if s1 matches the pattern s2; otherwise, it
returns 0. strmatch("17.4","1??4") returns 1. In
s2, "?" means that one character goes here, and "*"
means that zero or more characters go here. Also
see regexm(), regexr(), and regexs().
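The wildcard rules can be checked on a small dataset. This is an illustrative sketch; the variable code and its values are invented for the example.

```stata
clear
input str10 code
"17.4"
"1234"
"27.4"
"1x"
end
* "1??4": a 1, exactly two characters of any kind, then a 4
generate byte m1 = strmatch(code, "1??4")
* "1*": a 1 followed by zero or more characters
generate byte m2 = strmatch(code, "1*")
list
```

Under the rules above, m1 should be 1 for "17.4" and "1234" only, and m2 should be 1 for every value except "27.4".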
strofreal(n) is a synonym for string(n).
strofreal(n,s) is a synonym for string(n,s).
strpos(s1,s2)
Domain s1: strings (to be searched)
Domain s2: strings (to search for)
Range: integers 0 to 244
Description: returns the position in s1 at which s2 is first found,
or 0 if s2 does not occur in s1.
strpos("this","is") = 3
strpos("this","it") = 0
strproper(x) is a synonym for proper(x).
strreverse(x) is a synonym for reverse(x).
strrtrim(x) is a synonym for rtrim(x).
strtoname(s,p)
Domain s: strings
Domain p: 0 or 1
Range: strings
Description: returns s translated into a Stata name. Each character
in s that is not allowed in a Stata name is
converted to an underscore character, _. If the
first character in s is a numeric character and p is
not 0, then the result is prefixed with an
underscore. The result is truncated to 32
characters.
strtoname("name",1) = "name"
strtoname("a name",1) = "a_name"
strtoname("5",1) = "_5"
strtoname("5:30",1) = "_5_30"
strtoname("5",0) = "5"
strtoname("5:30",0) = "5_30"
strtoname(s)
Domain s: strings
Range: strings
Description: returns s translated into a Stata name. Each character
in s that is not allowed in a Stata name is
converted to an underscore character, _. If the
first character in s is a numeric character, then
the result is prefixed with an underscore. The
result is truncated to 32 characters.
strtoname("name") = "name"
strtoname("a name") = "a_name"
strtoname("5") = "_5"
strtoname("5:30") = "_5_30"
strtrim(x) is a synonym for trim(x).
strupper(x) is a synonym for upper(x).
subinstr(s1,s2,s3,n)
Domain s1: strings (to be substituted into)
Domain s2: strings (to be substituted from)
Domain s3: strings (to be substituted with)
Domain n: integers 0 to 244 and missing
Range: strings
Description: returns s1, where the first n occurrences in s1 of s2
have been replaced with s3. If n is missing, all
occurrences are replaced. Also see regexm(),
regexr(), and regexs().
subinstr("this is this","is","X",1) = "thX is this"
subinstr("this is this","is","X",2) = "thX X this"
subinstr("this is this","is","X",.) = "thX X thX"
subinword(s1,s2,s3,n)
Domain s1: strings (to be substituted into)
Domain s2: strings (to be substituted from)
Domain s3: strings (to be substituted with)
Domain n: integers 0 to 244 and missing
Range: strings
Description: returns s1, where the first n occurrences in s1 of s2 as
a word have been replaced with s3. A word is
defined as a space-separated token. A token at the
beginning or end of s1 is considered space
separated. If n is missing, all occurrences are
replaced. Also see regexm(), regexr(), and
regexs().
subinword("this is this","is","X",1) = "this X this"
subinword("this is this","is","X",.) = "this X this"
subinword("this is this","th","X",.) = "this is this"
substr(s,n1,n2)
Domain s: strings
Domain n1: integers 1 to 244 and -1 to -244
Domain n2: integers 1 to 244 and -1 to -244
Range: strings
Description: returns the substring of s, starting at column n1, for a
length of n2. If n1 < 0, n1 is interpreted as
distance from the end of the string; if n2 = .
(missing), the remaining portion of the string is
returned.
substr("abcdef",2,3) = "bcd"
substr("abcdef",-3,2) = "de"
substr("abcdef",2,.) = "bcdef"
substr("abcdef",-3,.) = "def"
substr("abcdef",2,0) = ""
substr("abcdef",15,2) = ""
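substr() is often paired with strpos() to split a string at a delimiter. The sketch below assumes a made-up string of the form name=value held in a local macro; the macro names s and p are hypothetical.

```stata
local s "name=value"
* position of the delimiter within the string
local p = strpos("`s'", "=")
* everything before the delimiter
display substr("`s'", 1, `p' - 1)
* everything after the delimiter; missing n2 means "to the end"
display substr("`s'", `p' + 1, .)
```

Here p is 5, so the two display commands should show name and value.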
trim(s)
Domain: strings
Range: strings without leading or trailing blanks
Description: returns s without leading and trailing blanks;
equivalent to ltrim(rtrim(s)). trim(" this ") =
"this"
upper(s)
Domain: strings
Range: strings with uppercased characters
Description: returns the uppercased variant of s. upper("this") =
"THIS"
word(s,n)
Domain s: strings
Domain n: integers ...,-2,-1,0,1,2,...
Range: strings
Description: returns the nth word in s. Positive numbers count words
from the beginning of s, and negative numbers count
words from the end of s. (1 is the first word in s,
and -1 is the last word in s.) Returns missing ("")
if n is missing.
wordcount(s)
Domain: strings
Range: nonnegative integers 0,1,2,...
Description: returns the number of words in s. A word is a set of
characters that starts and terminates with spaces,
starts with the beginning of the string, or terminates
with the end of the string.
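word() and wordcount() together allow a loop over the words of a string. The following is a minimal sketch using a made-up list held in a local macro; the macro names s, k, and i are hypothetical.

```stata
local s "alpha beta gamma"
* number of space-separated words in the list
local k = wordcount("`s'")
display `k'
* walk the words from first to last
forvalues i = 1/`k' {
    display word("`s'", `i')
}
* a negative index counts from the end of the string
display word("`s'", -1)
```

With this list, k is 3 and the final display command should show gamma.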
Also see
Manual: [D] functions
Help: [D] destring, [D] encode, [D] egen