Statalist



st: RE: substringing long, varying length text variables into individual variables


From   "Raymond Guiteras" <rpguiteras@gmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: substringing long, varying length text variables into individual variables
Date   Wed, 2 Apr 2008 17:57:04 -0400

Shouldn't -- insheet using whatever.whatever, delimit("|") -- do the trick?
Or am I missing something?

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Todd Wagner
Sent: Wednesday, April 02, 2008 4:48 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: substringing long, varying length text variables into individual variables

Hi,

I have data from a publicly available database 
(clinicaltrials.gov).  This database has a number of text variables 
that I want to break into individual variables and I could use some help.

For example, one of the variables is called study designs.  Here are 
some data from the study designs variable:

Treatment|Randomized|Double-Blind|Placebo Control|Parallel Assignment|Safety/Efficacy Study
Prevention|Randomized|Open Label|Active Control|Parallel Assignment|Bio-equivalence Study
Prevention|Randomized|Double Blind (Subject, Caregiver, Investigator, Outcomes Assessor)|Crossover Assignment
Randomized|Single Blind|Active Control|Parallel Assignment
Natural History|Cross-Sectional|Case Control|Prospective Study
Treatment|Randomized|Open Label|Active Control|Parallel Assignment|Efficacy Study
Treatment|Randomized|Double-Blind|Placebo Control|Single Group Assignment|Safety/Efficacy Study
Treatment|Randomized|Open Label|Placebo Control|Parallel Assignment|Safety/Efficacy Study
Treatment|Randomized|Double-Blind|Active Control|Parallel Assignment|Safety/Efficacy Study
Prevention|Randomized|Double-Blind|Placebo Control|Parallel Assignment|Safety/Efficacy Study
Treatment|Randomized|Single Blind (Investigator)|Placebo Control|Parallel Assignment
Treatment|Randomized|Open Label|Active Control|Parallel Assignment|Efficacy Study

What I want to do is split this text on the "|" character into individual variables.

So the first case would be
des1       des2        des3          des4             des5                 des6
Treatment  Randomized  Double-Blind  Placebo Control  Parallel Assignment  Safety/Efficacy Study

I can think of a brute-force way where I save this variable and my id 
variable, change | to a comma, output as text, read the text into 
Stata as a comma-separated file, and then merge it back into my 
data.  Sounds silly, but perhaps it is the easiest.  Any other ideas?

Thanks,

Todd
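
One alternative that avoids the export and re-import round trip is Stata's -split- command, which breaks a string variable into new variables at a given separator. The sketch below is only an illustration, not a reply from the thread: the variable name studydesigns and the three typed-in records are assumptions standing in for the real clinicaltrials.gov data. Note also that some of the sample values contain commas (for example "Double Blind (Subject, Caregiver, Investigator, Outcomes Assessor)"), so changing | to a comma would split those values in the wrong places unless the fields were quoted.

```stata
* Hypothetical example data: three records typed in by hand
clear
input str120 studydesigns
"Treatment|Randomized|Double-Blind|Placebo Control|Parallel Assignment|Safety/Efficacy Study"
"Randomized|Single Blind|Active Control|Parallel Assignment"
"Natural History|Cross-Sectional|Case Control|Prospective Study"
end

* Split on the pipe character; the new variables are named des1, des2, ...
split studydesigns, parse("|") generate(des)

* Records with fewer fields get empty strings in the trailing variables
list des*, noobs
```

Because -split- creates the new variables alongside the existing ones, the id variable stays on the same observation and no merge is needed.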

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



