
Re: st: Need to Split String Variable

From   Nick Cox
Subject   Re: st: Need to Split String Variable
Date   Thu, 23 May 2013 19:13:44 +0100

With these problems it helps to have different tools to hand.

Here is a crude way to do it. The advantage of this method should be
that it is easy to understand in principle.

The algorithm is

loop until we get to the first character:
      look at the last character
      if it's a letter, remove it

and naturally we need to look at every observation. So, how to do that in Stata?

We can be crude about this.  Stata will be faster doing almost nothing
than we are (I am, strictly) at writing code. (May not apply to Bill

Let's assume Mike has a -str20-. If it's some other length, as it will
be, change 20.

clonevar mycopy = mystrvar

qui forval j = 1/20 {
     replace mycopy = substr(mycopy, 1, length(mycopy) - 1) ///
         if inrange(upper(substr(mycopy, -1, 1)), "A", "Z")
}

That should be it.
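
To check the loop, here is a minimal self-contained sketch using the two example codes from Mike's post. The toy dataset and the variable -len- are made up for illustration. As a variation on the fixed 20, the loop limit is taken from the longest value actually stored, which is an assumption about what is convenient, not part of the method above.

```stata
* Toy data: the two example codes quoted in the original question
clear
input str20 mystrvar
"444.44AD"
"V45.45.D"
end

* Longest stored value sets the loop limit (instead of hard-coding 20)
generate int len = length(mystrvar)
summarize len, meanonly
local maxlen = r(max)
drop len

clonevar mycopy = mystrvar

* Each pass strips one trailing letter wherever there still is one
quietly forvalues j = 1/`maxlen' {
     replace mycopy = substr(mycopy, 1, length(mycopy) - 1) ///
         if inrange(upper(substr(mycopy, -1, 1)), "A", "Z")
}

list mystrvar mycopy
```

On these data the first value should come out as "444.44" and the second as "V45.45.", with the "." left in place because it is not a letter.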

Going through it again

substr(mycopy, -1, 1)

is the last character.

upper(<that>) maps "a" .. "z" to "A" .. "Z"

and so forth.
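
If it helps, the pieces can be tried one at a time with -display-. This is only a sketch on literal strings taken from the examples in this thread.

```stata
* Last character of the string
display substr("444.44AD", -1, 1)

* upper() folds lower case to upper case, so one range test covers both
display upper("d")

* inrange() on strings: 1 if the character sorts between "A" and "Z"
display inrange("D", "A", "Z")
display inrange(".", "A", "Z")

* Everything except the last character
display substr("444.44AD", 1, length("444.44AD") - 1)
```

The two -inrange()- calls show why the loop stops at a digit or a ".": those characters sort before "A", so the -if- condition is false and nothing more is removed.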

I don't know what Mike wants to do with the "." before the "D" in his
second example.

P.S. You could try -moss- (SSC). Robert Picard has probably already
written that post.
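
As another tool to have to hand, here is a sketch using official Stata's -regexr()- rather than -moss-. This is a different technique from the loop above and is not what -moss- does internally. The variable name -mycopy2- is made up, and the pattern assumes "letter" means A to Z in either case.

```stata
* Toy data: the two example codes quoted in the original question
clear
input str20 mystrvar
"444.44AD"
"V45.45.D"
end

* Delete a run of one or more letters anchored at the end of the string
generate mycopy2 = regexr(mystrvar, "[A-Za-z]+$", "")

list mystrvar mycopy2
```

Like the loop, this leaves a leading letter alone so long as something other than a letter follows it, and it would empty a value that consists of letters only.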


On 23 May 2013 18:50, Michael Stewart wrote:
> I am working with ICD9 codes as some of the codes are wrongly coded
> 444.44AD
> V45.45.D
> The goal is to remove the trailing alphabets BUT not leading alphabet