Statalist



Re: st: RE: substringing long, varying length text variables into individual variables


From   "Tom Trikalinos" <ttrikalin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: substringing long, varying length text variables into individual variables
Date   Wed, 2 Apr 2008 18:16:40 -0400

-insheet- with a "|" delimiter would do the trick if there are no other
variables AFTER the current string.
If there are additional variables AFTER the current string, then the
varying number of "|"-delimited substrings per row would break the column
alignment: the later variables would land in different columns from one
row to the next.

But the post implies that this is not the case, and that this variable is
one example of a set of similar variables...
- or I may be wrong and tired - I had a long, long day...

t
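
If the string is already in memory alongside other variables, one way to sidestep the alignment problem is to parse it in place with -split- rather than re-reading the file. A minimal sketch, assuming the variable is named studydesigns (a hypothetical name; the post only calls it "study designs") and that the stub des is not already in use:

```stata
* studydesigns is an assumed variable name, not taken from the original post
* parse("|") splits on the pipe character; generate(des) names the new
* string variables des1, des2, ... up to the largest number of pieces found
split studydesigns, parse("|") generate(des)

* rows with fewer pieces are left with empty strings in the trailing
* des variables, so the other variables in the dataset are unaffected
describe des*
```

Because -split- works row by row on a single variable, a varying number of "|"-delimited pieces only changes how many des variables are filled in; it does not shift anything else.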

On Wed, Apr 2, 2008 at 5:57 PM, Raymond Guiteras <rpguiteras@gmail.com> wrote:
> Shouldn't -- insheet using whatever.whatever, delimit("|") -- do the trick?
>  Or am I missing something?
>
>  -----Original Message-----
>  From: owner-statalist@hsphsun2.harvard.edu
>  [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Todd Wagner
>  Sent: Wednesday, April 02, 2008 4:48 PM
>  To: statalist@hsphsun2.harvard.edu
>  Subject: st: substringing long, varying length text variables into
>  individual variables
>
>  Hi,
>
>  I have data from a publicly available database
>  (clinicaltrials.gov).  This database has a number of text variables
>  that I want to break into individual variables and I could use some help.
>
>  For example, one of the variables is called study designs.  Here are
>  some data from the study designs variable
>
>  Treatment|Randomized|Double-Blind|Placebo Control|Parallel Assignment|Safety/Efficacy Study
>  Prevention|Randomized|Open Label|Active Control|Parallel Assignment|Bio-equivalence Study
>  Prevention|Randomized|Double Blind (Subject, Caregiver, Investigator, Outcomes Assessor)|Crossover Assignment
>  Randomized|Single Blind|Active Control|Parallel Assignment
>  Natural History|Cross-Sectional|Case Control|Prospective Study
>  Treatment|Randomized|Open Label|Active Control|Parallel Assignment|Efficacy Study
>  Treatment|Randomized|Double-Blind|Placebo Control|Single Group Assignment|Safety/Efficacy Study
>  Treatment|Randomized|Open Label|Placebo Control|Parallel Assignment|Safety/Efficacy Study
>  Treatment|Randomized|Double-Blind|Active Control|Parallel Assignment|Safety/Efficacy Study
>  Prevention|Randomized|Double-Blind|Placebo Control|Parallel Assignment|Safety/Efficacy Study
>  Treatment|Randomized|Single Blind (Investigator)|Placebo Control|Parallel Assignment
>  Treatment|Randomized|Open Label|Active Control|Parallel Assignment|Efficacy Study
>
>  What I want to do is parse this text using the "|" into individual variables
>
>  So the first case would be
>  des1       des2        des3          des4             des5                 des6
>  Treatment  Randomized  Double-Blind  Placebo Control  Parallel Assignment  Safety/Efficacy Study
>
>  I can think of a brute force way where I save this variable and my id
>  variable, change | to a comma, output as text, read the text into
>  stata as a comma separated file, and then merge it back into my
>  data.  Sounds silly, but perhaps it is the easiest.  Any other ideas?
>
>  Thanks,
>
>  Todd
>
>  *
>  *   For searches and help try:
>  *   http://www.stata.com/support/faqs/res/findit.html
>  *   http://www.stata.com/support/statalist/faq
>  *   http://www.ats.ucla.edu/stat/stata/
>
>


