
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Extracting substrings from variables.


From   Amal Khanolkar <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: Extracting substrings from variables.
Date   Fri, 25 May 2012 14:26:01 +0000

Hi again,

It works now! I forgot to specify the '=1' in the gen command.

However, when I do this two ways (gen with inlist() versus regexm()/regexs()), I get slightly different counts, which shouldn't be the case:


. gen ht=1 if inlist(substr(mdiag1, 1, 3), "637", "642") | substr(mdiag1,1, 2) == "O1"
(2951413 missing values generated)

. tab ht

         ht |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |     40,043      100.00      100.00
------------+-----------------------------------
      Total |     40,043      100.00


. gen preght1 = regexs(0) if regexm(mdiag1, "^637|642|O1")

. tab preght1

    preght1 |      Freq.     Percent        Cum.
------------+-----------------------------------
        637 |      8,314       20.62       20.62
        642 |     21,537       53.42       74.05
         O1 |     10,462       25.95      100.00
------------+-----------------------------------
      Total |     40,313      100.00


ht and preght1 above should be the same variable (or at least should flag the same observations). I'm not sure what is causing the difference of 270 (40,313 vs. 40,043).
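
One possible source of the gap, offered as a sketch rather than a diagnosis of the actual data: in the pattern "^637|642|O1", alternation binds more loosely than the anchor, so ^ applies only to 637, while 642 and O1 can match anywhere in the string. The inlist()/substr() version tests only the leading characters. The toy data below are hypothetical (not taken from the poster's file), and the variable names loose and strict are made up for illustration.

```stata
* Hypothetical toy data: two values carry "642" or "O1" away from the start
clear
input str10 mdiag1
"637A"
"642B"
"O10"
"V642"
"2O1"
"638"
end

* inlist()/substr() version: looks only at the leading characters
gen ht = 1 if inlist(substr(mdiag1, 1, 3), "637", "642") | substr(mdiag1, 1, 2) == "O1"

* Pattern as posted: ^ anchors only the first alternative (637)
gen byte loose = regexm(mdiag1, "^637|642|O1")

* Grouped pattern: ^ applies to every alternative
gen byte strict = regexm(mdiag1, "^(637|642|O1)")

list

* Observations flagged by the posted pattern but not by the anchored one
count if loose & !strict
```

If this is what is happening in the real data, values such as "V642" or "2O1" would be picked up by the posted pattern but not by ht, and `list mdiag1 if loose & !strict` on the real variables would show exactly which codes make up the difference.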


I also tried to combine the several preght variables I created, which all use the same diagnostic codes but come from different time periods (named preght1, preght2, preght3, and so on), using egen with the concat() function, but it doesn't give me the right numbers. Is there another command that would do the job better?
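
A minimal sketch of one alternative, assuming the goal is a single indicator that is 1 whenever any of the period-specific variables holds a matched code. egen's concat() joins the strings together, so a row flagged in two periods yields a value like "642642" and tabulations no longer line up with the individual variables. The toy data and the names nflag and anyht are hypothetical.

```stata
* Hypothetical toy data: three string variables holding the matched code
* (empty string when there was no match in that period)
clear
input str3 preght1 str3 preght2 str3 preght3
"637" ""    ""
""    "642" "642"
""    ""    ""
"O1"  ""    "637"
end

* Count how many of the variables are non-empty in each row;
* the strok option is required because the variables are strings
egen nflag = rownonmiss(preght1 preght2 preght3), strok

* 1 if any period carries one of the codes, 0 otherwise
gen byte anyht = nflag > 0

list
tab anyht
```

With more variables, a range such as preght1-preght6 can replace the explicit list, provided the variables are stored next to each other in the dataset.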

/Amal.

______
From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: 25 May 2012 16:04
To: [email protected]
Subject: Re: st: Extracting substrings from variables.

Yes; my idea is that one of your parentheses ( or ) was missing! I've
rechecked my example and it looks OK.

if inlist(substr(m1diagx, 1, 3), "637", "642") | substr(m1diagx, 1, 2) == "O1"

Stata is just like elementary algebra: parentheses (), brackets [], and
braces {} must all occur in pairs. You don't show us your code, so you
will need to count them for yourself.

Nick

On Fri, May 25, 2012 at 2:31 PM, Amal Khanolkar <[email protected]> wrote:
> Thanks Brendan - it worked like a charm!  :)
>
> Nick - I tried your way using 'inlist', but I kept getting an error message saying that one bracket was missing. I tried several ways to solve the issue without success. Any ideas?
>
> I agree with both of you - regexs can be annoying, especially for someone like me who came across it for the first time today :)
>
>
> Thanks!
>
> /Amal.
>
>
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Brendan Halpin [[email protected]]
> Sent: 25 May 2012 14:07
> To: [email protected]
> Subject: Re: st: Extracting substrings from variables.
>
> On Fri, May 25 2012, Brendan Halpin wrote:
>
>> On Fri, May 25 2012, Amal Khanolkar wrote:
>>
>>> gen preght = regexs(0) if regexm(mdiag1x, "[^637] | [^642] | [^O1]")
>>
>> A quick and untested suggestion:
>>
>> . gen preght = regexs(0) if regexm(mdiag1x, "^(637)|(642)|(O1)")
>
> On testing, it seems the grouping parentheses are not necessary:
>
> ...................................................................
> . input str10 mdiag1x
>
>        mdiag1x
>  1.    "637 asdf"
>  2.    "638 asdf"
>  3.    "8637 asdf"
>  4.    "642 asdf"
>  5.    "O1 asdf"
>  6. end
>
> . gen preght = regexs(0) if regexm(mdiag1x, "^637|642|O1")
> (2 missing values generated)
>
> . gen hasdiag = regexm(mdiag1x, "^637|642|O1")
>
> . list
>
>     +------------------------------+
>     |   mdiag1x   preght   hasdiag |
>     |------------------------------|
>  1. |  637 asdf      637         1 |
>  2. |  638 asdf                  0 |
>  3. | 8637 asdf                  0 |
>  4. |  642 asdf      642         1 |
>  5. |   O1 asdf       O1         1 |
>     +------------------------------+
> ...................................................................

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


