Re: st: break string into multiple variables


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: break string into multiple variables
Date   Thu, 11 Oct 2012 09:57:43 +0100

There is no such option, but the reason why deserves a comment. (A
little of the history of -split- is documented at [D] split.)

-split- is an official command focused on one problem: splitting a
string according to separators or parse characters. In the easiest
case, the separators are spaces, so that the substrings are words in
Stata's sense. In "2 3 5 7 11 13", the separate integers are words
sensu Stata just as surely as the words in "Stata rules OK" and "Queen
rules UK" (to disinter briefly a popular British idiom from the
1970s).

The separators could be something else: e.g. "@" separates distinct
parts of email addresses.
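
A minimal sketch of both cases, assuming toy data: the variable names
primes, email and part, and the address itself, are made up for
illustration.

```stata
* Toy data; variable names and values are hypothetical
clear
input str20 primes str30 email
"2 3 5 7 11 13" "someone@example.com"
end

* Default behaviour: split on spaces, creating primes1-primes6
split primes

* Split on "@" instead, creating part1 and part2
split email, parse("@") generate(part)

list
```

The new variables are strings; -destring- (or the destring option of
-split-) converts them to numeric where that makes sense.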

That splitting problem basically calls for a loop over words, with
each distinct word going into a distinct variable. Simple as that
sounds, it was a popular question on Statalist several years ago.
Essentially that was what led to -split-, with -strparse- (M. Blasnik
and myself, SSC) a step on the way.

Another splitting problem is the kind illustrated by Sophie, in which
there are no separator characters. The answer is always to use
-substr()- as many times as needed, spelling out the starting
position every time and the length often.
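
For Sophie's layout of one character per question, that could be
sketched as a loop. This assumes the answers sit in a string variable,
here called answers (a hypothetical name), and that new variables
q1-q13 are wanted.

```stata
* Toy data; the variable name answers is hypothetical
clear
input str13 answers
"2222112222"
"1212121212121"
end

* One new numeric variable per character position: q1-q13
forvalues i = 1/13 {
    generate byte q`i' = real(substr(answers, `i', 1))
}

list
```

Where a string is shorter than 13 characters, -substr()- returns an
empty string and -real()- turns that into numeric missing, so the
later questions are simply missing for that observation.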

To come to the point: I thought at some length about adding some
functionality to -split- to cover this second kind of problem but
decided it would just add complication to the syntax. It seemed easy
enough and in many ways better to emphasise using -substr()- directly
as the answer. But that was, and is, a judgement call. At the back of
my mind was always the Unix precept that a program should do one thing
well.

Nick

On Thu, Oct 11, 2012 at 1:53 AM, Steve Nakoneshny <scnakone@ucalgary.ca> wrote:
> Sophie,
>
> At first glance, I would suggest looking at -split-, but only prior to using -destring-. There might be an option for splitting each character into a new var.
>
> Steve
>
> Sent via carrier pigeon
>
> On 2012-10-10, at 5:33 PM, "Sophie Moullin" <sophiemoullin@gmail.com> wrote:
>
>> Dear all,
>>
>> I've got a number of string variables which contain answers for up to
>> 13 questions within the broad variable category, which when destrung
>> are listed as one single numeric, e.g. 2222112222 (1 is yes, 2 is no).
>> I need to create separate variables for each question 1-13.
>>
>> I haven't been able to find any example code for this on any of the
>> stata documentation or FAQ. Any advice would be much appreciated.