"STOLOWY, Herve" <stolowy@hec.fr>

statalist@hsphsun2.harvard.edu

Re: st: Manipulation of string variable using -regexm-

Sat, 12 Oct 2013 15:46:08 +0200

Dear Nick:

After

gen var_star = (`end' == "*") + 2 * (`end' == "*.") + 3 * (`end' ="*+")

I get an error message:

unknown function ()

Best regards

Hervé

On Sat, Oct 12, 2013 at 12:25 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Note also other solutions such as
>
> local end substr(CurrRtg, 2, .)
> gen var_star = (`end' == "*") + 2 * (`end' == "*.") + 3 * (`end' ="*+")
> assert `end' == "" if var_star == 0
>
> Nick
> njcoxstata@gmail.com
>
>
> On 11 October 2013 21:59, Federico Belotti <f.belotti@gmail.com> wrote:
>> Dear Herve
>>
>> my suggestion is to use the command -screening-, a Stata's user-written string variables exploring and recoding tool.
>> You need to search and install it using
>>
>> findit screening
>>
>> Once installed, the syntax you are looking for to obtain a new numeric variable equal to 0 if not star, 1 if only *, 2 if *- and 3 if *+ is the following
>>
>> screening, source(CurrRtg, upper) key(end "\*" end "\*-" end "\*\+" end "[A-Z]") new(mark, numeric) recode(1 "1" 2 "2" 3 "3" 4 "0")
>>
>> where
>>
>> 1) the option -source()- specifies the source variable that have to be recoded (note the suboption -upper- which allows to perform a case-insensitive match (uppercase));
>> 2) the option -key()- specifies the keywords you are looking for (in this case represented by regular expressions);
>> 3) the option -new()- specifies the name of the new variable to be created (in this case, I called it "mark". Note the suboption -numeric- that allows to get the newly created variable as a numeric variable);
>> 4) the option -recode()- specifies the user-defined coding scheme following the keywords order.
>>
>> See -help screening- for more details.
>>
>> Hope this helps.
>> Federico
>>
>>
>> On Oct 11, 2013, at 6:40 PM, STOLOWY, Herve wrote:
>>
>>> Dear Statalisters:
>>>
>>> Using Stata 12.1, I want to extract a portion of a string variable using
>>> regular expressions, i.e. -regexs- and -regexm-.
>>>
>>> My string variable has different possible values. Example:
>>>
>>> A
>>> A *
>>> A *-
>>> A *+
>>> B
>>> B *
>>> B *-
>>> B *+
>>> etc.
>>>
>>> I would like to get a variable with the content filled with the * or *- or
>>> *+ or with this type of coding:
>>>
>>> 0 if not star
>>> 1 if only *
>>> 2 if *-
>>> 3 if *+
>>>
>>> The * or *- or *+ always appear at the end on the value.
>>>
>>> I tried the following syntax:
>>>
>>> gen var_star = regexs(0) if(regexm(CurrRtg, "\*" "\*+" "\*-"))
>>>
>>> Unfortunately, I get a * in all cases there is a * included in the value,
>>> but I do not get the *- or *+.
>>>
>>> I have difficulties with the syntax of -regexm-.
>>>
>>> There is maybe another way to get the same result.
>>>
>>> Best regards
>>>
>>> Herve Stolowy
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
>>
>> --
>> Federico Belotti, PhD
>> Research Fellow
>> Centre for Economics and International Studies
>> University of Rome Tor Vergata
>> tel/fax: +39 06 7259 5627
>> e-mail: federico.belotti@uniroma2.it
>> web: http://www.econometrics.it
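
For reference, the coding asked for in the original question (0 if no star, 1 if *, 2 if *-, 3 if *+) could be sketched with a single -regexm- pattern anchored at the end of the string. This is only an illustration, not the solution posted by anyone in the thread: the example data are typed in by hand to mirror the values listed in the question, and the intermediate variable name star is an assumption.

```stata
* Hypothetical example data mirroring the values listed in the question
clear
input str6 CurrRtg
"A"
"A *"
"A *-"
"A *+"
"B"
"B *"
"B *-"
"B *+"
end

* Capture the trailing marker (*, *- or *+), if there is one.
* "\*" is a literal asterisk, "[-+]?" an optional minus or plus,
* and "$" anchors the match at the end of the string.
gen str2 star = regexs(0) if regexm(CurrRtg, "\*[-+]?$")

* Numeric coding from the question: 0 = no star, 1 = *, 2 = *-, 3 = *+
gen byte var_star = (star == "*") + 2 * (star == "*-") + 3 * (star == "*+")

list CurrRtg star var_star, noobs
```

Observations without a trailing marker get an empty string in star, so all three comparisons are false and var_star is 0 for them.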

