
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: How do I split a string variable without spaces by capital letters?


From   Haluk Vahaboglu <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: How do I split a string variable without spaces by capital letters?
Date   Mon, 19 Aug 2013 23:37:14 +0000

Nick, may I ask a simple question (though surely not simple to me)?
I am trying to learn the secrets of Stata. To that end, I run code posted to this list on my own system (Stata 12.1, 64-bit Ubuntu), since it might be useful in my future studies.
In this context, I ran your loop with a small modification, as shown below:

clear all
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end
foreach L in `c(ALPHA)' {
        replace v1= subinstr(v1 "`L'", " `L'", .)
}

To my surprise, this did not work. Stata returned the error:
v1"A" invalid name
r(198);

It works in the form you posted:
clonevar v2 = v1
qui foreach L in `c(ALPHA)' {
        replace v2 = subinstr(v2, "`L'", " `L'", .)
}

I wonder why this loop fails without "clonevar v2 = v1". I guess there is a very easy answer that I cannot see.
Thank you in advance
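
A possible explanation, offered as a sketch rather than a confirmed answer from the thread: apart from the variable name, the failing loop differs from the working one by a missing comma after the first argument of subinstr(), so Stata appears to read v1 "A" as a single malformed name. Assuming that is the cause, the loop can modify v1 in place without clonevar. The trim() call and the final list are additions for illustration only.

```stata
clear all
input str13 v1
"TestOne"
"ThisistestTwo"
"AndThree"
end

* Prefix every upper-case letter with a space, editing v1 directly.
* Note the comma after v1: the failing version omitted it.
foreach L in `c(ALPHA)' {
    replace v1 = subinstr(v1, "`L'", " `L'", .)
}

* Illustrative extra step (not in the original posts): drop the
* leading space created in front of the first capital letter.
replace v1 = trim(v1)
list
```

Under this reading, clonevar only protects the original variable; it is not what makes the loop run.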


Prof. Dr. Haluk Vahaboğlu
Istanbul Medeniyet Üniversitesi,
Göztepe Eğitim ve Araştırma Hastanesi
Enfeksiyon Hastalıkları ve Klinik Mikrobiyoloji ABD
Dr. Erkin Caddesi  34730
Kadıköy / Istanbul  TURKIYE



> Date: Mon, 19 Aug 2013 17:33:23 +0100
> Subject: Re: st: How do I split a string variable without spaces by capital letters?
> From: [email protected]
> To: [email protected]
> 
> Along these lines you could prefix every upper-case letter with a space.
> 
> clonevar v2 = v1
> 
> qui foreach L in `c(ALPHA)' {
>         replace v2 = subinstr(v2, "`L'", " `L'", .)
> }
> 
> split v2
> 
> For c(ALPHA) see results of -creturn list-.
> 
> That doesn't presuppose just two substrings.
> 
> Nick
> [email protected]
> 
> 
> On 19 August 2013 16:36, Eric A. Booth <[email protected]> wrote:
>> <>
>> Agreed, -moss- is great for this, but also you can do this using
>> built-in string functions if you are interested, example:
>>
>> *****************!
>> clear all
>> inp str13(v1)
>> "TestOne"
>> "ThisistestTwo"
>> "AndThree"
>> end
>>
>> g v2 = reverse(v1)
>> g pos = .
>> g l = length(v1)
>> foreach x in `c(ALPHA)' {
>>    replace pos = strpos(v2, "`x'") if inlist(pos, ., 0, l)
>>   }
>> drop v2
>> g first = substr(v1, 1, l-pos)
>> g second = substr(v1, l-pos+1, l)
>> list
>> *****************!
>> EAB
>>
>>
>>
>> On Mon, Aug 19, 2013 at 10:31 AM, Robert Picard <[email protected]> wrote:
>>> You can use -moss- (available from SSC) to handle this problem. The
>>> following works with your example:
>>>
>>> moss v1, match("([A-Z][^A-Z]*)") regex
>>>
>>> The pattern indicates that you are looking for substrings that start
>>> with a capital letter (i.e. [A-Z]) followed by zero or more non-capital
>>> letters (i.e. [^A-Z]*).
>>>
>>> On Mon, Aug 19, 2013 at 10:06 AM, Andrew Dickens <[email protected]> wrote:
>>>> Hi all,
>>>>
>>>> I'm currently running Stata 10, and I'm having a problem splitting a string
>>>> variable by capital letters. Elena Vidal posted something under a similar
>>>> title, http://www.stata.com/statalist/archive/2011-11/msg01195.html, but
>>>> her problem is somewhat different from mine and I was unable to
>>>> troubleshoot.
>>>>
>>>> An example of my data is as follows:
>>>>
>>>> clear all
>>>> inp str13(v1)
>>>> "TestOne"
>>>> "ThisistestTwo"
>>>> "AndThree"
>>>> end
>>>>
>>>> The problem is the capital letter I wish to split each cell by is not
>>>> consistently placed.
>>>>
>>>> I tried splitting using this code:
>>>>
>>>> split v1, p(upper(a-z))
>>>> or
>>>> split v1, p(upper(.))
>>>>
>>>> but this just generates an identical variable to v1.
>>>>
>>>> What I would like to do is create two new variables, so the first
>>>> observation of my example would have "Test" in the first new variable and
>>>> "One" in the second new variable. Suggestions would be greatly appreciated.
>>>>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


