" split v1, g(new_) parse(`=char(9)') destring"

The -destring- option cannot have any effect at this point, since the first row of the dataset still holds the "var" name strings, so every new variable contains non-numeric values. It is probably better to -destring, replace- towards the end of the code, after that first row has been dropped.
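A minimal sketch of that reordering, reusing the example data from the quoted message below (shortened to two rows, with values I made up for illustration). The only substantive changes are that -split- is called without -destring- and that -destring, replace- runs after the header row is gone. I have written the first-row lookup as `x'[1], which is my own choice of syntax here rather than what was originally posted:

```stata
clear*
input var1 str20 var2 var3 str20 var4 var5 var6
1 "a b" 100 "c d" 2000 .1
2 "e f" 200 "g h" 2000 .2
end

* write a tab-delimited file, then read it back as a single string variable v1
outsheet * using exampledata.txt, noquote replace
insheet using exampledata.txt, comma nonames clear

* split on tabs, but do not destring yet: row 1 still holds the names
split v1, generate(new_) parse(`=char(9)')

* rename each new variable using the value in its first observation
foreach x of varlist new_* {
    local newname = `x'[1]
    rename `x' `newname'
}
drop in 1
drop v1

* now the numeric columns contain only numbers, so destring can convert them
destring, replace
describe
```

Variables that still contain text (here var2 and var4) are left as strings by -destring-, which is what you want.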

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tirthankar Chakravarty
Sent: Thursday, 25 February 2010 00:17
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Breaking one string variable into several new variables

I think your problem would be much better solved by importing carefully (see the code below for hints as to how this might work), but in case you are stuck with data of the kind you show, here is how you might recover the original data. From the way your example data has wrapped, I am guessing that tabs separate the variables. If not, please let me know:

**************************************
clear*
input var1 str20 var2 var3 str20 var4 var5 var6
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
end

outsheet * using exampledata.txt, noquote replace
insheet using exampledata.txt, comma nonames clear
li, clean

split v1, g(new_) parse(`=char(9)') destring

// rename each variable using the value in the first row
foreach x of varlist new_* {
    local newname = `x'[1]
    rename `x' `newname'
}
drop in 1
drop v1
li, noobs
**************************************
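As a hedged sketch of the "importing carefully" route: if the raw file really is tab-delimited with variable names in its first line (an assumption, since I have not seen the actual file), telling -insheet- so should give six variables directly, with no splitting needed. The file name and the two data rows below are only the made-up example, not the real data:

```stata
clear*
input var1 str20 var2 var3 str20 var4 var5 var6
1 "a b" 100 "c d" 2000 .1
2 "e f" 200 "g h" 2000 .2
end

* write a tab-delimited example file with a header row
outsheet * using exampledata.txt, noquote replace

* read it back, declaring the delimiter and the header row explicitly
insheet using exampledata.txt, tab names clear
describe
list, clean
```

If the delimiter turns out to be something other than a tab, -insheet- also accepts -delimiter("char")- in place of -tab-.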

T

2010/2/25 Anna Rakhman <amr0084@gmail.com>:
> Dear Statalist,
>
> I have the following issue I was hoping you could help with. I've imported
> data from a .txt file and no matter how I import it, I always end up with
> one variable while I really need 6 different variables.
>
> This is what my file looks like now (these are the first 4 observations of
> variable v1, the only variable in the dataset):
>
> industry1 industry1_def industry2
> industry2_def year value
> 1 oilseed farming 100
> cotton farming 2000 .1
> 2 logging 200
> iron ore mining 2000 .2
> 3 blah and blah and blah 300
> yata, yata 2000 .3
>
> This is a made-up example, but as you can see, the problem is that each
> column should be a separate variable.
>
> I've tried using gen split1=(v1,1), gen split2=(v1,-1) and gen
> split3=(v1,-2) to get industry1, value, and year as separate variables, but
> I'm not sure how to get industry2 as a separate variable because it is not a
> fixed number of words from either end of the string.
>
> Any suggestions?
>
> Thanks!
> Anna

--
To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).

