

From: Amal Khanolkar <Amal.Khanolkar@ki.se>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Extracting substrings from variable and combining variables.
Date: Mon, 4 Jun 2012 08:42:44 +0000

Hi,

Originally, I started using the regular-expression functions regexm() and regexs() to extract ICD codes from my variables of interest (mdiag1, mdiag2, mdiag3, mdiag4, etc.). I'm extracting the same ICD codes from all the mdiag variables: those starting with 637, 642 or O1.

Initially I extracted the ICD codes from each mdiag variable separately, with the idea of combining them at the end. But that now seems more complicated than it needs to be. One solution might be to extract the ICD codes from all the mdiag variables at the same time. There are 12 such mdiag variables.

    gen preght1 = regexs(0) if regexm(mdiag1, "^(637|642|O1)")
    tab preght1
    gen preght2 = regexs(0) if regexm(mdiag2, "^(637|642|O1)")
    tab preght2
    gen preght3 = regexs(0) if regexm(mdiag3, "^(637|642|O1)")
    tab preght3
    gen preght4 = regexs(0) if regexm(mdiag4, "^(637|642|O1)")
    tab preght4
    gen preght5 = regexs(0) if regexm(mdiag5, "^(637|642|O1)")
    tab preght5
    gen preght6 = regexs(0) if regexm(mdiag6, "^(637|642|O1)")
    tab preght6
    gen preght7 = regexs(0) if regexm(mdiag7, "^(637|642|O1)")
    tab preght7
    gen preght8 = regexs(0) if regexm(mdiag8, "^(637|642|O1)")
    tab preght8

The above generates 8 preght variables and works great.

I then tried to combine the (mdiagX, "^(637|642|O1)") arguments for each mdiag variable by enclosing them in separate brackets one after another, but that doesn't work.

How do I modify the regexs()/regexm() calls so that Stata plucks out the ICD codes from several variables in the same command line?

Thanks,
Amal.

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [n.j.cox@durham.ac.uk]
Sent: 01 June 2012 19:31
To: 'statalist@hsphsun2.harvard.edu'
Subject: RE: st: Extracting substrings from variable and combining variables.

There is a tutorial on -foreach- at

SJ-2-2  pr0005  . . . . . .  Speaking Stata: How to face lists with fortitude
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/02   SJ 2(2):202--222                                 (no commands)
        demonstrates the usefulness of for, foreach, forvalues, and
        local macros for interactive (non programming) tasks

. search foreach

will get you to a clickable link.

If you want a composite variable, you can use -egen- and then modify the resulting variable with -replace- to get what you want. Or you can write your own code to get what you want. I don't know exactly what you want, so that rules out further suggestions from me for the time being. You would get better help by giving examples of what the variables you want would look like.

Nick
n.j.cox@durham.ac.uk

Amal Khanolkar

Thanks for the input and code. I didn't really understand what the code does ("foreach" etc.), but it does pluck out those that have the 3 diagnoses of interest and creates 3 separate variables as follows:

. tab has637

     has637 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,969,464       99.26       99.26
          1 |     21,992        0.74      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

. tab has642

     has642 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,948,590       98.57       98.57
          1 |     42,866        1.43      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

. tab hasO1

      hasO1 |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,968,084       99.22       99.22
          1 |     23,372        0.78      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

- The above also gives a lower number and skips those recorded as duplicates.

- I think using the replace command to restructure preght is probably easier. However, did you mean that I should do it before that, that is, using the original 12 variables and skipping egen altogether?
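
A loop is one way to apply the same extraction to several variables in one go, as the -foreach- tutorial cited above suggests. What follows is only a minimal sketch, not code from the thread. It assumes the twelve variables are strings named mdiag1 to mdiag12, and the combined variable name preght_any is hypothetical. Note that regexs(0) returns just the matched text ("637", "642" or "O1"), not the full ICD code; assigning the mdiag variable itself instead of regexs(0) would keep the whole code.

```stata
* Sketch only: assumes string variables mdiag1-mdiag12 are in memory.
* preght1-preght12 follow the naming used in the post;
* preght_any is a hypothetical name for the combined variable.
forvalues i = 1/12 {
    * Same expression as in the post, applied to each variable in turn.
    * regexs(0) holds the text matched by the preceding regexm() call.
    generate preght`i' = regexs(0) if regexm(mdiag`i', "^(637|642|O1)")
}

* Combine: keep the first nonmissing match found across the 12 variables.
generate preght_any = ""
forvalues i = 1/12 {
    replace preght_any = preght`i' if preght_any == "" & preght`i' != ""
}

tabulate preght_any
```

Taking the first nonmissing match is one design choice among several. If a record can carry more than one of the three diagnoses, separate indicator variables (like the has637, has642 and hasO1 variables shown above) would preserve that information, whereas a single combined variable would not.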
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

