

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Extracting substrings from variable and combining variables.
Date: Mon, 4 Jun 2012 10:20:38 +0100

Previously I wrote "I don't know exactly what you want, so that rules out further suggestions from me for the time being. You would get better help by giving examples of what the variables you want would look like."

You've not done this. All that I can pick up here is that you want to combine variables. I don't know what that "combining" means. So, this is another (but final) attempt from me to help.

Note that -regexm()- and -regexs()- are functions, not commands. This is not just a piece of pedantry, as (1) referring to functions as commands may confuse at least some readers, and clarifies nothing; (2) thinking of these, always, as functions helps remind everyone that they are defined and documented distinctly.

It seems that you have variables -mdiag1-mdiag8- and wish to extract diagnoses "O1", "637", "642". You expect those diagnoses to be leading substrings.

You can create a new composite variable this way.

    gen anydiag = ""
    foreach diag in O1 637 642 {
        forval j = 1/8 {
            local len = length("`diag'")
            replace anydiag = anydiag + "`diag'" if substr(mdiag`j', 1, `len') == "`diag'"
        }
    }

But we've already gone over similar ideas in this thread. I don't think you ever said why you can't work from that resulting composite variable.

You can create new indicator variables this way.

    gen hasO1 = 0
    gen has637 = 0
    gen has642 = 0
    forval j = 1/8 {
        replace hasO1 = 1 if hasO1 == 0 & substr(mdiag`j', 1, 2) == "O1"
        replace has637 = 1 if has637 == 0 & substr(mdiag`j', 1, 3) == "637"
        replace has642 = 1 if has642 == 0 & substr(mdiag`j', 1, 3) == "642"
    }

This can be done with regex machinery too as a matter of taste.

Nick

On Mon, Jun 4, 2012 at 9:42 AM, Amal Khanolkar <Amal.Khanolkar@ki.se> wrote:

> Originally, I started using the 'regex' command to extract ICD codes from my variables of interest shown below (mdiag1, mdiag2, mdiag3, mdiag4 etc....). I'm extracting the same ICD codes from all the mdiag variables starting with the numbers/letters: 637, 642 and O1. Initially I extracted the ICD codes from each mdiag variable separately with the idea of combining them at the end. But that seems a bit more complicated now. Maybe, one solution could be to extract all ICD codes from all mdiag variables at the same time. There are 12 such mdiag variables.
>
> gen preght1 = regexs(0) if regexm(mdiag1, "^(637|642|O1)")
> tab preght1
>
> gen preght2 = regexs(0) if regexm(mdiag2, "^(637|642|O1)")
> tab preght2
>
> gen preght3 = regexs(0) if regexm(mdiag3, "^(637|642|O1)")
> tab preght3
>
> gen preght4 = regexs(0) if regexm(mdiag4, "^(637|642|O1)")
> tab preght4
>
> gen preght5 = regexs(0) if regexm(mdiag5, "^(637|642|O1)")
> tab preght5
>
> gen preght6 = regexs(0) if regexm(mdiag6, "^(637|642|O1)")
> tab preght6
>
> gen preght7 = regexs(0) if regexm(mdiag7, "^(637|642|O1)")
> tab preght7
>
> gen preght8 = regexs(0) if regexm(mdiag8, "^(637|642|O1)")
> tab preght8
>
> The above generates 8 preght variables and works great.
>
> Initially I tried to combine the (mdiagX, "^(637|642|O1) for each mdiag variable by enclosing them in separate brackets one after another. But it doesn't work. How do I modify the regexs/regexm commands to be able to tell Stata to pluck out the ICD codes for several variables in the same command line?
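The reply's closing remark, that the indicator variables can also be built with regex machinery, is not spelled out in the thread. A minimal sketch of what that might look like follows. It assumes string variables mdiag1-mdiag8 exist as described in the thread; the variable name firstdiag is hypothetical and introduced only for illustration.

```stata
* Sketch only: assumes string variables mdiag1-mdiag8, as in the thread.
* Indicator names follow those used in the reply.
gen byte hasO1  = 0
gen byte has637 = 0
gen byte has642 = 0

forval j = 1/8 {
    * "^" anchors the match to the start of the string,
    * so only leading substrings count
    replace hasO1  = 1 if regexm(mdiag`j', "^O1")
    replace has637 = 1 if regexm(mdiag`j', "^637")
    replace has642 = 1 if regexm(mdiag`j', "^642")
}

* Hypothetical single variable holding the first matching code found
* when scanning mdiag1, mdiag2, ..., mdiag8 in order
gen firstdiag = ""
forval j = 1/8 {
    replace firstdiag = regexs(0) if regexm(mdiag`j', "^(637|642|O1)") & firstdiag == ""
}
```

Because regexm() and regexs() are functions evaluated within one expression on one variable at a time, a loop over the mdiag variables is the natural way to cover several of them; a single call cannot take several variables at once.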

