Stata 15 help for f_ustrlen

[FN] String functions

Functions

ustrlen(s)
   Description:  the number of characters in the Unicode string s

An invalid UTF-8 sequence is counted as one Unicode character. An invalid UTF-8 sequence may contain one byte or multiple bytes. Note that any Unicode character beyond the plain ASCII range (code point greater than 127) takes more than 1 byte in the UTF-8 encoding; for example, é takes 2 bytes.

   ustrlen("médiane") = 7
   strlen("médiane") = 8

   Domain s:     Unicode strings
   Range:        integers >= 0
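The difference between character count and byte count can be checked directly. This is a minimal sketch, assuming the lines are run from a do-file saved in UTF-8; the local macro name word is illustrative only.

```stata
* Store the example string in a local macro (the name "word" is hypothetical)
local word "médiane"

* ustrlen() counts Unicode characters; strlen() counts bytes
display ustrlen("`word'")
display strlen("`word'")

* Bytes used beyond one per character; é contributes the extra byte
display strlen("`word'") - ustrlen("`word'")
```

Per the examples above, the first two lines display 7 and 8, so the difference is 1.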

ustrinvalidcnt(s)
   Description:  the number of invalid UTF-8 sequences in s

An invalid UTF-8 sequence may contain one byte or multiple bytes.

   ustrinvalidcnt("médiane") = 0
   ustrinvalidcnt("médiane"+char(229)) = 1
   ustrinvalidcnt("médiane"+char(229)+char(174)) = 1
   ustrinvalidcnt("médiane"+char(174)+char(158)) = 2

   Domain s:     Unicode strings
   Range:        integers >= 0
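One possible use of ustrinvalidcnt() is to flag observations whose strings contain invalid UTF-8. The sketch below assumes a do-file saved in UTF-8; the variable names name and n_invalid and the two-row dataset are hypothetical and exist only for illustration.

```stata
* Build a small illustrative dataset (hypothetical values)
clear
input str20 name
"mean"
"médiane"
end

* Append a stray byte to the second observation, as in the example above
replace name = name + char(229) in 2

* Count invalid UTF-8 sequences in each observation
generate int n_invalid = ustrinvalidcnt(name)

* List only the observations that contain at least one invalid sequence
list name n_invalid if n_invalid > 0
```

Only the second observation should be listed, with n_invalid equal to 1, matching ustrinvalidcnt("médiane"+char(229)) = 1 above.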


© Copyright 1996–2018 StataCorp LLC