Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: RE: extracting a specific portion of a string


From   Eric Booth <[email protected]>
To   "<[email protected]>" <[email protected]>
Subject   Re: st: RE: extracting a specific portion of a string
Date   Thu, 17 Mar 2011 14:15:36 +0000

<>
On Mar 17, 2011, at 9:10 AM, Travis Coan wrote:

> This is actually Jorge's string variable, not mine. Though, I have
> certainly enjoyed your and Nick's posts.

Ah yes, sorry about that Travis.

- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]

> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Eric Booth
> Sent: Thursday, March 17, 2011 9:57 AM
> To: <[email protected]>
> Subject: Re: st: RE: extracting a specific portion of a string
> 
> <>
> On Mar 17, 2011, at 5:42 AM, <[email protected]>
> wrote:
> 
>> Wouldn't it be simpler to rename "MOTHER'SBLOOD" "BLOOD,MATERNAL"?
>> 
>> reg
>> 
> 
> I doubt it.  It's probably a safe bet that Travis's string variable
> takes more than just the 9 values he showed us in the example. 
> If there were hundreds or thousands of values where "BLOOD" or "SERUM"
> were located later in the string, would you really want to write some
> form of -replace v1 = "BLOOD,MATERNAL" if v1 == "MOTHER'SBLOOD"- for every
> possible instance? Also, Travis may be interested in extracting more
> than just "BLOOD" AND "SERUM" (such as "LIPEMIC",  "1ST",  "SPECIMEN",
> etc.) which could become problematic if you start jumbling up the
> variable just to get "BLOOD" to the front of it.  Better to leave the
> string variable in place and use string functions to flag observations
> that contain some substring of interest or extract substrings into other
> variables.
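> 
> A minimal sketch of that approach, assuming a string variable v1 like
> the one in the example data quoted further down. The names has_blood,
> specimen, and key are hypothetical, and only three of the example
> values are entered to keep it short:
> 
> **********************!
> clear
> input str20 v1
> "BLOOD(LIPEMIC)"
> "MOTHER'SBLOOD"
> "SERUM,1STSPECIMEN"
> end
> 
> * flag observations that contain "BLOOD" anywhere in v1
> g byte has_blood = strpos(v1, "BLOOD") > 0
> 
> * copy the first matching keyword into a new variable,
> * leaving v1 itself untouched
> g str10 specimen = ""
> foreach key in "BLOOD" "SERUM" {
>     replace specimen = "`key'" if specimen == "" & strpos(v1, "`key'")
> }
> li v1 has_blood specimen
> **********************!
> 
> Adding another keyword to the -foreach- list is then a one-word change,
> with no per-value -replace- statements needed.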
> 
> 
> On Mar 17, 2011, at 4:04 AM, Nick Cox wrote:
>> <snip>
>> On a detail that might confuse: Eric used -index()- and -strpos()-. In
>> essence, -index()- is the old name that still works, while -strpos()-
>> is the new name. It's the same function underneath the names.
> 
> This is (at least) the second time Nick has been kind enough to remind
> me that strpos() is the modern version of index() -- old habits die
> hard.  (http://www.stata.com/statalist/archive/2011-02/msg01111.html)
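> 
> A quick way to check that equivalence, as a sketch that assumes a Stata
> version in which the older name -index()- is still accepted (as it was
> in the Stata 11 discussed in this thread):
> 
> **********************!
> clear
> input str20 v1
> "BLOOD"
> "MOTHER'SBLOOD"
> "SERUM,1STSPECIMEN"
> end
> 
> * both names return the starting position of the substring, 0 if absent
> assert index(v1, "BLOOD") == strpos(v1, "BLOOD")
> 
> * "BLOOD" starts at character 9 of "MOTHER'SBLOOD"
> di strpos("MOTHER'SBLOOD", "BLOOD")
> **********************!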
> 
> - Eric
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> [email protected]
> Office: +979.845.6754
> 
> 
>> 
>> -----Original Message-----
>>> From: Eric Booth <[email protected]>
>>> Sent: Mar 17, 2011 12:15 AM
>>> To: "<[email protected]>" <[email protected]>
>>> Subject: Re: st: RE: extracting a specific portion of a string
>>> 
>>> <>
>>> 
>>> On Mar 16, 2011, at 10:43 PM, Travis Coan wrote:
>>>> 
>>>> I would take a look at the -substr- function -- typing 'help substr' should get you there.
>>>> 
>>> 
>>> You should probably look at all the functions available in -help string_functions-.
>>> Note that -substr- alone wouldn't return the desired result in this example, e.g.:
>>> 
>>> **********************!
>>> clear
>>> inp str20(v1)
>>> "BLOOD"
>>> "BLOOD(LIPEMIC)"
>>> "BLOOD(MODERATELYLY"
>>> "BLOOD, 2ND SPECIMEN"
>>> "BLOOD,1STSPECIMEN"
>>> "BLOOD,2NDSPECIMEN"
>>> "MOTHER'SBLOOD"
>>> "SERUM,1STSPECIMEN"
>>> "SERUM,2NDSPECIMEN"
>>> end
>>> 
>>> g v2 = substr(v1, 1, 5)
>>> **note obs 7
>>> 
>>> //using strpos and substr string functions//
>>> g str10 v4 = ""
>>> foreach x in "BLOOD" "SERUM" {
>>> g v`x' = strpos(v1, "`x'")
>>> replace v4 = substr(v1, v`x' , 5) if v`x'>0
>>> }
>>> 
>>> //using index//
>>> g ind = 0
>>> replace ind = 1 if index(v1, "BLOOD")
>>> replace ind = 2 if index(v1, "SERUM")
>>> la def ii 1 "Blood" 2 "Serum", modify
>>> lab val ind ii
>>> li
>>> **********************!
>>> 
>>> - Eric
>>> __
>>> Eric A. Booth
>>> Public Policy Research Institute
>>> Texas A&M University
>>> [email protected]
>>> 
>>> 
>>>> 
>>>> 
>>>> From: [email protected] [mailto:[email protected]] On Behalf Of Mendoza Aldana, Dr Jorge Antonio (WPRO)
>>>> Sent: Wednesday, March 16, 2011 10:36 PM
>>>> To: [email protected]
>>>> Subject: st: extracting a specific portion of a string
>>>> 
>>>> Dear all,
>>>> My dataset has a string variable, from which I need a specific portion of it. The content of the variable is like:
>>>> 
>>>> BLOOD
>>>> BLOOD(LIPEMIC)
>>>> BLOOD(MODERATELYLY
>>>> BLOOD, 2ND SPECIMEN
>>>> BLOOD,1STSPECIMEN
>>>> BLOOD,2NDSPECIMEN
>>>> MOTHER'SBLOOD
>>>> SERUM,1STSPECIMEN
>>>> SERUM,2NDSPECIMEN
>>>> 
>>>> and I need to generate a new variable containing either "BLOOD" or "SERUM"
>>>> I would appreciate very much if you can give me some hints on solving this.
>>>> I'm using Stata 11.1 on a Windows XP machine
>>>> Kind regards,
>>>> Jorge
>>>> 
>>>> 
> 

