

From: Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Breaking one string variable into several new variables

Date: Thu, 25 Feb 2010 04:47:07 +0530

I think your problem would be much better solved by importing carefully (see the code below for hints as to how this might work). But in case you are stuck with data of the kind you show, here is how you might recover the original data. From the way your example data has wrapped, I am guessing that you have tabs separating variables. If not, please let me know:

**************************************
clear*
input var1 str20 var2 var3 str20 var4 var5 var6
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
1 "a b" 100 "c d" 2000 .1
end
// write a tab-delimited file with the variable names in the first row
outsheet * using exampledata.txt, noquote replace
// read it back as if comma-delimited, so that every line lands in v1
insheet using exampledata.txt, comma nonames clear
li, clean
// split v1 on the tab character
split v1, g(new_) parse(`=char(9)') destring
// rename from first row
foreach x of varlist new_* {
    // value of `x' in the first observation
    local newname = `x'[1]
    rename `x' `newname'
}
drop in 1
drop v1
li, noobs
**************************************

T

2010/2/25 Anna Rakhman <amr0084@gmail.com>:
> Dear Statalist,
>
> I have the following issue I was hoping you could help with. I've imported
> data from a .txt file and no matter how I import it, I always end up with
> one variable while I really need 6 different variables.
>
> This is what my file looks like now (this is the first 4 observations of
> variable v1, the only variable in the dataset):
>
> industry1 industry1_def industry2
> industry2_def year value
> 1 oilseed farming 100
> cotton farming 2000 .1
> 2 logging 200
> iron ore mining 2000 .2
> 3 blah and blah and blah 300
> yata, yata 2000 .3
>
> This is a made-up example, but as you can see, the problem is that each
> column should be a separate variable.
>
> I've tried using gen split1=(v1,1), gen split2=(v1,-1) and gen
> split3=(v1,-2) to get industry1, value, and year as separate variables, but
> I'm not sure how to get industry2 as a separate variable because it is not a
> fixed number of words from either end of the string.
>
> Any suggestions?
>
> Thanks!
> Anna
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

--
To every ω-consistent recursive class κ of formulae there correspond recursive class signs r, such that neither v Gen r nor Neg(v Gen r) belongs to Flg(κ) (where v is the free variable of r).
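The reply notes that importing carefully would avoid the problem altogether. A minimal sketch of that route follows, assuming the raw file really is tab-delimited with variable names in its first row. The two-row dataset and the file name exampledata.txt are illustrative only, mirroring the example in the message.

```stata
* Sketch only: assumes a tab-delimited file with variable names in row 1
clear
input var1 str20 var2 var3 str20 var4 var5 var6
1 "a b" 100 "c d" 2000 .1
2 "e f" 200 "g h" 2000 .2
end

* outsheet writes tab-delimited text by default, names in the first row
outsheet * using exampledata.txt, noquote replace

* re-import, telling insheet explicitly that the delimiter is a tab
insheet using exampledata.txt, tab names clear

describe
list, noobs
```

If the file has to be repaired after the fact with split instead, the new variables are likely to remain strings, because the header text in the first observation prevents the destring option from converting them. Running `destring, replace` after `drop in 1` should convert the numeric columns.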

**Follow-Ups**: **AW: st: Breaking one string variable into several new variables** *From:* "Martin Weiss" <martin.weiss1@gmx.de>

**References**: **st: Breaking one string variable into several new variables** *From:* Anna Rakhman <amr0084@gmail.com>
