Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Manipulation of string variable using -regexm-
Date: Sat, 12 Oct 2013 17:43:51 +0100

This corrects a typo (sorry). Note that the definition of the local
macro is essential for this to work, although it could be rewritten to
avoid that.

local end substr(CurrRtg, 2, .)
gen var_star = (`end' == "*") + 2 * (`end' == "*.") + 3 * (`end' =="*+")
assert `end' == "" if var_star == 0

Nick
njcoxstata@gmail.com

On 12 October 2013 14:46, STOLOWY, Herve <stolowy@hec.fr> wrote:
> Dear Nick:
>
> After
>
> gen var_star = (`end' == "*") + 2 * (`end' == "*.") + 3 * (`end' ="*+")
>
> I get an error message:
>
> unknown function ()
>
> Best regards
>
> Hervé
>
>
> On Sat, Oct 12, 2013 at 12:25 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Note also other solutions such as
>>
>> local end substr(CurrRtg, 2, .)
>> gen var_star = (`end' == "*") + 2 * (`end' == "*.") + 3 * (`end' ="*+")
>> assert `end' == "" if var_star == 0
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 11 October 2013 21:59, Federico Belotti <f.belotti@gmail.com> wrote:
>>> Dear Herve
>>>
>>> my suggestion is to use the command -screening-, a Stata's user-written string variables exploring and recoding tool.
>>> You need to search and install it using
>>>
>>> findit screening
>>>
>>> Once installed, the syntax you are looking for to obtain a new numeric variable equal to 0 if not star, 1 if only *, 2 if *- and 3 if *+ is the following
>>>
>>> screening, source(CurrRtg, upper) key(end "\*" end "\*-" end "\*\+" end "[A-Z]") new(mark, numeric) recode(1 "1" 2 "2" 3 "3" 4 "0")
>>>
>>> where
>>>
>>> 1) the option -source()- specifies the source variable that have to be recoded (note the suboption -upper- which allows to perform a case-insensitive match (uppercase));
>>> 2) the option -key()- specifies the keywords you are looking for (in this case represented by regular expressions);
>>> 3) the option -new()- specifies the name of the new variable to be created (in this case, I called it "mark". Note the suboption -numeric- that allows to get the newly created variable as a numeric variable);
>>> 4) the option -recode()- specifies the user-defined coding scheme following the keywords order.
>>>
>>> See -help screening- for more details.
>>>
>>> Hope this helps.
>>> Federico
>>>
>>>
>>> On Oct 11, 2013, at 6:40 PM, STOLOWY, Herve wrote:
>>>
>>>> Dear Statalisters:
>>>>
>>>> Using Stata 12.1, I want to extract a portion of a string variable using
>>>> regular expressions, i.e. -regexs- and -regexm-.
>>>>
>>>> My string variable has different possible values. Example:
>>>>
>>>> A
>>>> A *
>>>> A *-
>>>> A *+
>>>> B
>>>> B *
>>>> B *-
>>>> B *+
>>>> etc.
>>>>
>>>> I would like to get a variable with the content filled with the * or *- or
>>>> *+ or with this type of coding:
>>>>
>>>> 0 if not star
>>>> 1 if only *
>>>> 2 if *-
>>>> 3 if *+
>>>>
>>>> The * or *- or *+ always appear at the end on the value.
>>>>
>>>> I tried the following syntax:
>>>>
>>>> gen var_star = regexs(0) if(regexm(CurrRtg, "\*" "\*+" "\*-"))
>>>>
>>>> Unfortunately, I get a * in all cases there is a * included in the value,
>>>> but I do not get the *- or *+.
>>>>
>>>> I have difficulties with the syntax of -regexm-.
>>>>
>>>> There is maybe another way to get the same result.
>>>>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
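
For readers following the thread, the coding requested in the original question (0 if no star, 1 if only *, 2 if *-, 3 if *+) can also be sketched with -regexm- alone, by anchoring each pattern at the end of the string with $. This is a minimal sketch and not code from any poster in the thread: the toy data entered with -input- and the variable name star_txt are assumptions for illustration, and it presumes the star suffix is always the last thing in CurrRtg, as the question states.

```stata
* Sketch only: toy data standing in for the real CurrRtg variable (assumed values)
clear
input str6 CurrRtg
"A"
"A *"
"A *-"
"A *+"
"B"
"B *-"
end

* Numeric coding: each pattern is anchored with $ so it must end the string
gen byte var_star = 0
replace var_star = 1 if regexm(CurrRtg, "\*$")
replace var_star = 2 if regexm(CurrRtg, "\*-$")
replace var_star = 3 if regexm(CurrRtg, "\*\+$")

* String version: keep the matched suffix itself (hypothetical name star_txt)
gen star_txt = regexs(0) if regexm(CurrRtg, "\*[+-]?$")

list CurrRtg var_star star_txt, noobs
```

The backslash makes * and + literal characters instead of regular expression operators, and the $ anchor is what separates a bare * from *- and *+, since a value ending in *- does not end in *.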

**Follow-Ups**:
- **Re: st: Manipulation of string variable using -regexm-** *From:* "STOLOWY, Herve" <stolowy@hec.fr>
- **Re: st: Manipulation of string variable using -regexm-** *From:* Roberto Ferrer <refp16@gmail.com>

**References**:
- **st: Manipulation of string variable using -regexm-** *From:* "STOLOWY, Herve" <stolowy@hec.fr>
- **Re: st: Manipulation of string variable using -regexm-** *From:* Federico Belotti <f.belotti@gmail.com>
- **Re: st: Manipulation of string variable using -regexm-** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: Manipulation of string variable using -regexm-** *From:* "STOLOWY, Herve" <stolowy@hec.fr>
