The idea of replacing separators in a text editor is good, but it might
be quite difficult for the 600 files that Denisa has. It can be automated
of course, say with a macro in Word, but why not write the whole
conversion in another programming environment? The final files will not
be used in Stata, will they?

2Friedrich: -insheet- does not solve the problem in my view, because
Stata will still be limited to 244 characters for strings.

2Denisa: please specify exactly what the rules are for commas, "|" and
missings in the input file.

Best regards,
Sergiy

On 8/24/07, Friedrich Huebler <fhuebler@gmail.com> wrote:
> Denisa,
>
> Is your example an accurate representation of your data? If so, you
> have a problem because there are no delimiters around fields with
> missing data. Here is a partial answer to your question that will read
> the data into Stata, but the columns won't line up.
>
> Step 1: Open the file in a text editor and replace all occurrences of
> " comma " by "|" (without quotes). This will yield the following file:
>
> Row1
> Name1|Name2|Address1|Address2|PatClass1|PatClass2|PatClass3
> Row 2
> Name3|Name4|Name5|Address3|Address4|Address5|PatClass4
>
> Step 2: Read the file into Stata with -insheet-
>
> . insheet using test.txt, delimit("|")
> . clist, noobs
>
> v1     v2     v3        v4        v5         v6         v7
> Row1
> Name1  Name2  Address1  Address2  PatClass1  PatClass2  PatClass3
> Row 2
> Name3  Name4  Name5     Address3  Address4   Address5   PatClass4
>
> Step 3: Delete the "Row" entries.
>
> . drop if mod(_n,2)>0
> (2 observations deleted)
>
> . clist, noobs
>
> v1     v2     v3        v4        v5         v6         v7
> Name1  Name2  Address1  Address2  PatClass1  PatClass2  PatClass3
> Name3  Name4  Name5     Address3  Address4   Address5   PatClass4
>
> Step 4: Save the data as a comma-separated file.
>
> . outsheet using test.csv, comma
>
> When you open the CSV file in a text editor you see this:
>
> v1,v2,v3,v4,v5,v6,v7
> "Name1","Name2","Address1","Address2","PatClass1","PatClass2","PatClass3"
> "Name3","Name4","Name5","Address3","Address4","Address5","PatClass4"
>
> Variable v3 should have a missing value in the first observation.
> Instead it contains Address1. Variables v4 to v7 also contain wrong
> data. I do not know how you can address this problem without
> information on missing values in your original data.
>
> Friedrich
>
> On 8/23/07, Mindruta, Denisa Constanta <mindruta@uiuc.edu> wrote:
> > Greetings!
> > I would appreciate any help on the following problem: I need to
> > import a (.csv) file containing several string variables that go
> > well beyond Stata limits. Is there a way to import the file and, at
> > the same time, parse these string variables into constituent words
> > (delimited by "|") before saving it as a Stata file?
> >
> > A simple example might help:
> > Row1
> > Name1|Name2 comma Address1|Address2 comma PatClass1|PatClass2|PatClass3
> > Row 2
> > Name3|Name4|Name5 comma Address3|Address4|Address5 comma PatClass4
> >
> > I want to get the following structure:
> > Row1
> > Name1 comma Name2 comma "missing info" comma Address1 comma Address2 comma "missing info" comma PatClass1 comma PatClass2 comma PatClass3
> > Row 2
> > Name3 comma Name4 comma Name5 comma Address3 comma Address4 comma Address5 comma PatClass4 comma "missing info" comma "missing info"
> >
> > Any suggestion on how to approach this problem? (This is just a
> > simple example; the text in a cell could go up to 200 words of 30
> > characters each, and I have 15 of these variables, and 600 files...)
> > Thanks!
> >
> > Denisa
> > University of Illinois Urbana-Champaign
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
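
A minimal sketch of the "another programming environment" route suggested at the top of this message, written in Python. Since the rules for commas, "|" and missings have not been specified yet, it rests on assumptions: fields are separated by real commas with ordinary CSV quoting, items inside a field are separated by "|", the "Row" labels from the example are not present in the actual files, and an empty string stands in for "missing info". The names widen_rows and convert_file are hypothetical, chosen only for this illustration.

```python
import csv


def widen_rows(rows, sep="|", fill=""):
    """Split every field on `sep` and pad each field to its widest row."""
    # An empty field is treated as holding no items at all (assumption).
    split = [[field.split(sep) if field else [] for field in row] for row in rows]
    n_fields = max((len(row) for row in split), default=0)
    # Largest number of items seen in each field position across all rows.
    widths = [
        max((len(row[i]) for row in split if i < len(row)), default=0)
        for i in range(n_fields)
    ]
    out = []
    for row in split:
        flat = []
        for i, width in enumerate(widths):
            items = row[i] if i < len(row) else []
            # Pad short fields so that columns line up across rows.
            flat.extend(items + [fill] * (width - len(items)))
        out.append(flat)
    return out


def convert_file(src, dst, sep="|", fill=""):
    """Read a comma-separated file, widen it, and write the result."""
    with open(src, newline="") as f:
        rows = list(csv.reader(f))
    with open(dst, "w", newline="") as f:
        csv.writer(f).writerows(widen_rows(rows, sep, fill))


if __name__ == "__main__":
    # The two data rows from Denisa's example, without the "Row" labels.
    example = [
        ["Name1|Name2", "Address1|Address2", "PatClass1|PatClass2|PatClass3"],
        ["Name3|Name4|Name5", "Address3|Address4|Address5", "PatClass4"],
    ]
    for row in widen_rows(example, fill="."):
        print(",".join(row))
```

Padding is computed per file here, so the number of columns can differ between files. If all 600 files need the same layout, the widths would have to be collected over all files first and then applied in a second pass.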

