


st: regular expression -split string an unknown number of times

From   "Rodini, Mark" <>
To   <>
Subject   st: regular expression -split string an unknown number of times
Date   Thu, 4 Aug 2011 10:54:21 -0700


I have a simple question.  I have a list of strings representing names which lack any spaces and I'm trying to insert a space in the correct place or places to split out the names.
For example, I might have:

JohnPaulJones

which I'd like to turn into

John Paul Jones

The rule is to insert a space before any upper case letter that is followed by a lower case letter.

gen teststring = regexs(1) if regexm(var,"^([A-Z][a-z]+)")

gives the first word.  I think I could do the following to get John Paul:

gen teststring = regexs(1) + " " + regexs(2) if regexm(var,"^([A-Z][a-z]+)([A-Z][a-z]+)")

The difficulty I'm having is that the number of subnames in a string is variable.  The example above has three subnames, but I might have one with two or four, etc.  I'm not sure how to program that.
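
One possible way to handle a variable number of subnames, sketched here under assumptions rather than offered as a definitive answer, is to loop: each pass inserts a single space before the last unsplit "upper case followed by lower case" pair, and the loop stops once no observation matches any more. The toy data, the local macro names, and the pattern below are illustrative assumptions; only var and teststring come from the commands above.

```stata
* Toy data for illustration only (assumed, not from the original list)
clear
input str20 var
"JohnPaulJones"
"MaryAnn"
"Cher"
end

gen teststring = var

* Group 1: everything up to a non-space character (greedy).
* Group 2: an upper case letter followed by a lower case letter, plus the rest.
local pattern "^(.*[^ ])([A-Z][a-z].*)"

* Repeat until no observation still contains an unsplit subname.
local n = 1
while `n' > 0 {
    quietly replace teststring = regexs(1) + " " + regexs(2) ///
        if regexm(teststring, "`pattern'")
    quietly count if regexm(teststring, "`pattern'")
    local n = r(N)
}

list var teststring
```

Because the first group is greedy, each pass splits off the rightmost remaining subname, so a string with k subnames needs k - 1 passes, and the number of subnames never has to be known in advance. Strings with a single subname are left unchanged.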

Thanks for any help.

Mark Rodini
