Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: extracting a specific portion of a string
From: Eric Booth <[email protected]>
To: "<[email protected]>" <[email protected]>
Subject: Re: st: RE: extracting a specific portion of a string
Date: Thu, 17 Mar 2011 13:57:02 +0000
On Mar 17, 2011, at 5:42 AM, <[email protected]> wrote:
> Wouldn't it be simpler to rename "MOTHER'SBLOOD" "BLOOD,MATERNAL"?
>
> reg
>
I doubt it. It's probably a safe bet that Jorge's string variable takes more than just the 9 values he showed us in the example.
If there were hundreds or thousands of values where "BLOOD" or "SERUM" appeared later in the string, would you really want to write some form of -replace v1 = "BLOOD,MATERNAL" if v1 == "MOTHER'SBLOOD"- for every possible instance? Also, Jorge may be interested in extracting more than just "BLOOD" and "SERUM" (such as "LIPEMIC", "1ST", "SPECIMEN", etc.), which could become problematic if you start rearranging the variable's contents just to get "BLOOD" to the front. Better to leave the string variable as it is and use string functions either to flag observations that contain a substring of interest or to extract substrings into other variables.
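As a rough sketch of that flag-or-extract approach (only three of the example values are entered here, and the variable names has_blood, has_serum, and specimen are made up for illustration, not taken from anyone's data):

```stata
clear
input str20 v1
"BLOOD(LIPEMIC)"
"MOTHER'SBLOOD"
"SERUM,1STSPECIMEN"
end

* flag observations that contain a substring; v1 itself is left untouched
generate byte has_blood = strpos(v1, "BLOOD") > 0
generate byte has_serum = strpos(v1, "SERUM") > 0

* extract whichever of the two substrings is matched, wherever it occurs in v1
generate str5 specimen = regexs(1) if regexm(v1, "(BLOOD|SERUM)")
list
```

Adding another substring of interest then means adding one more flag or one more alternative to the pattern, rather than recoding the original values.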
On Mar 17, 2011, at 4:04 AM, Nick Cox wrote:
> <snip>
> On a detail that might confuse: Eric used -index()- and -strpos()-. In
> essence, -index()- is the old name that still works, while -strpos()-
> is the new name. It's the same function underneath the names.
This is (at least) the second time Nick has been kind enough to remind me that strpos() is the modern version of index() -- old habits die hard. (http://www.stata.com/statalist/archive/2011-02/msg01111.html)
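A quick way to check that the two names behave identically, assuming your Stata release still accepts -index()- as a synonym (the variable names p_new and p_old are just for illustration):

```stata
clear
input str20 v1
"BLOOD"
"MOTHER'SBLOOD"
"SERUM,1STSPECIMEN"
end

* position of "BLOOD" within v1 under the new and the old function name
generate p_new = strpos(v1, "BLOOD")
generate p_old = index(v1, "BLOOD")

* stops with an error if the two ever disagree
assert p_new == p_old
list
```

Both return the position of the first match and 0 when there is no match, which is why either can be used directly in an -if- condition.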
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
>
> -----Original Message-----
>> From: Eric Booth <[email protected]>
>> Sent: Mar 17, 2011 12:15 AM
>> To: "<[email protected]>" <[email protected]>
>> Subject: Re: st: RE: extracting a specific portion of a string
>>
>>
>> On Mar 16, 2011, at 10:43 PM, Travis Coan wrote:
>>>
>>> I would take a look at the -substr- function -- typing 'help substr' should get you there.
>>>
>>
>> You should probably look at all the functions available in -help string_functions-.
>> Note that -substr- alone wouldn't return the desired result in this example, e.g.:
>>
>> **********************!
>> clear
>> inp str20(v1)
>> "BLOOD"
>> "BLOOD(LIPEMIC)"
>> "BLOOD(MODERATELYLY"
>> "BLOOD, 2ND SPECIMEN"
>> "BLOOD,1STSPECIMEN"
>> "BLOOD,2NDSPECIMEN"
>> "MOTHER'SBLOOD"
>> "SERUM,1STSPECIMEN"
>> "SERUM,2NDSPECIMEN"
>> end
>>
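>> // naive approach: take the first 5 characters of v1
>> g v2 = substr(v1, 1, 5)
>> // note obs 7: v2 is "MOTHE" rather than "BLOOD"
>>
>> // using the strpos() and substr() string functions
>> g str10 v4 = ""
>> foreach x in "BLOOD" "SERUM" {
>>     // starting position of the substring within v1 (0 if absent)
>>     g v`x' = strpos(v1, "`x'")
>>     // copy the 5 characters starting at that position
>>     replace v4 = substr(v1, v`x', 5) if v`x' > 0
>> }
>>
>> // using index(), the older name for strpos(), to build a labeled indicator
>> g ind = 0
>> replace ind = 1 if index(v1, "BLOOD")
>> replace ind = 2 if index(v1, "SERUM")
>> la def ii 1 "Blood" 2 "Serum", modify
>> lab val ind ii
>> li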
>> **********************!
>>
>> - Eric
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> [email protected]
>>
>>
>>>
>>>
>>> From: [email protected] [mailto:[email protected]] On Behalf Of Mendoza Aldana, Dr Jorge Antonio (WPRO)
>>> Sent: Wednesday, March 16, 2011 10:36 PM
>>> To: [email protected]
>>> Subject: st: extracting a specific portion of a string
>>>
>>> Dear all,
>>> My dataset has a string variable from which I need to extract a specific portion. The contents of the variable look like:
>>>
>>> BLOOD
>>> BLOOD(LIPEMIC)
>>> BLOOD(MODERATELYLY
>>> BLOOD, 2ND SPECIMEN
>>> BLOOD,1STSPECIMEN
>>> BLOOD,2NDSPECIMEN
>>> MOTHER'SBLOOD
>>> SERUM,1STSPECIMEN
>>> SERUM,2NDSPECIMEN
>>>
>>> and I need to generate a new variable containing either "BLOOD" or "SERUM".
>>> I would very much appreciate any hints on solving this.
>>> I'm using Stata 11.1 on a Windows XP machine.
>>> Kind regards,
>>> Jorge
>>>
>>>