Statalist



Re: st: importing LONG string variables


From   "Sergiy Radyakin" <serjradyakin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: importing LONG string variables
Date   Fri, 24 Aug 2007 14:19:32 -0400

The idea of replacing the separators in a text editor is good, but it
might be quite tedious for the 600 files that Denisa has. It can be
automated, of course, say with a macro in Word,
but why not write the whole conversion in another programming environment?
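
As a rough illustration of such a conversion, here is a minimal sketch in Python (any scripting language would do). It rests on assumptions taken from Denisa's example rather than on her real files: the literal text " comma " separates variables, "|" separates the words within a variable, label lines look like "Row1" or "Row 2", and missing cells are written as empty strings. The names FIELD_SEP, WORD_SEP, MISSING, split_record and convert are made up for this sketch.

```python
import csv
import io
import re

FIELD_SEP = " comma "  # assumed: separates variables, as in the example
WORD_SEP = "|"         # assumed: separates words inside one variable
MISSING = ""           # assumed: padding cells are left empty


def split_record(line):
    """Split one data line into a list of word lists, one per variable."""
    return [group.split(WORD_SEP) for group in line.strip().split(FIELD_SEP)]


def convert(lines):
    """Return rectangular rows, each variable padded to its widest occurrence."""
    # Assumed: lines such as "Row1" or "Row 2" are labels and carry no data.
    records = [
        split_record(line)
        for line in lines
        if line.strip() and not re.fullmatch(r"Row\s*\d+", line.strip())
    ]
    n_groups = max(len(rec) for rec in records)
    # Widest word count seen for each variable across all records.
    widths = [
        max(len(rec[g]) if g < len(rec) else 0 for rec in records)
        for g in range(n_groups)
    ]
    rows = []
    for rec in records:
        row = []
        for g, width in enumerate(widths):
            words = rec[g] if g < len(rec) else []
            row.extend(words + [MISSING] * (width - len(words)))
        rows.append(row)
    return rows


sample = [
    "Row1",
    "Name1|Name2 comma Address1|Address2 comma PatClass1|PatClass2|PatClass3",
    "Row 2",
    "Name3|Name4|Name5 comma Address3|Address4|Address5 comma PatClass4",
]
rows = convert(sample)

# Write the padded rows as ordinary comma-separated text.
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

Padding is done per file here, so the column layout depends on the widest record in that file. For the 600 files, the same function could be called in a loop, but the widths would then have to be fixed across files if the outputs are to be stacked later.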

The final files will not be used in Stata, will they?

2Friedrich:
insheet does not solve the problem in my view, because Stata will
still be limited to 244 characters for strings.
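
To get a feel for the sizes involved, here is a small Python sketch that compares a whole cell with its "|"-delimited pieces against the 244-character limit mentioned above. The cell is hypothetical, built to the size Denisa describes (200 words of 30 characters each); STATA_STR_LIMIT and longest_piece are names made up for this sketch, and the limit should be checked against the Stata version in use.

```python
STATA_STR_LIMIT = 244  # limit cited in this thread; verify for your Stata version


def longest_piece(cell, sep="|"):
    """Length of the longest sep-delimited piece in one cell."""
    return max(len(piece) for piece in cell.split(sep))


# Hypothetical cell: 200 words of 30 characters each, joined by "|".
cell = "|".join(["x" * 30] * 200)

# Compare the unsplit cell and its longest piece against the limit.
print(len(cell), len(cell) > STATA_STR_LIMIT)
print(longest_piece(cell), longest_piece(cell) > STATA_STR_LIMIT)
```

Running a check like this on the real files would show which fields, if any, remain too long after splitting.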

2Denisa: please specify exactly what the rules are for commas, "|",
and missing values in the input file.

Best regards, Sergiy


On 8/24/07, Friedrich Huebler <fhuebler@gmail.com> wrote:
> Denisa,
>
> Is your example an accurate representation of your data? If so, you
> have a problem because there are no delimiters around fields with
> missing data. Here is a partial answer to your question that will read
> the data into Stata, but the columns won't line up.
>
> Step 1: Open the file in a text editor and replace all occurrences of
> " comma " by "|" (without quotes). This will yield the following file:
>
> Row1
> Name1|Name2|Address1|Address2|PatClass1|PatClass2|PatClass3
> Row 2
> Name3|Name4|Name5|Address3|Address4|Address5|PatClass4
>
> Step 2: Read the file into Stata with -insheet-
>
> . insheet using test.txt, delimit("|")
> . clist, noobs
>
>   v1         v2         v3         v4         v5         v6         v7
>  Row1
> Name1      Name2   Address1   Address2  PatClass1  PatClass2  PatClass3
> Row 2
> Name3      Name4      Name5   Address3   Address4   Address5  PatClass4
>
> Step 3: Delete the "Row" entries.
>
> . drop if mod(_n,2)>0
> (2 observations deleted)
>
> . clist, noobs
>
>   v1         v2         v3         v4         v5         v6         v7
> Name1      Name2   Address1   Address2  PatClass1  PatClass2  PatClass3
> Name3      Name4      Name5   Address3   Address4   Address5  PatClass4
>
> Step 4: Save the data as a comma-separated file.
>
> . outsheet using test.csv, comma
>
> When you open the CSV file in a text editor you see this:
>
> v1,v2,v3,v4,v5,v6,v7
> "Name1","Name2","Address1","Address2","PatClass1","PatClass2","PatClass3"
> "Name3","Name4","Name5","Address3","Address4","Address5","PatClass4"
>
> Variable v3 should have a missing value in the first observation.
> Instead it contains Address1. Variables v4 to v7 also contain wrong
> data. I do not know how you can address this problem without
> information on missing values in your original data.
>
> Friedrich
>
> On 8/23/07, Mindruta, Denisa Constanta <mindruta@uiuc.edu> wrote:
> > Greetings!
> > I would appreciate any help with the following problem: I need to import a (.csv) file containing several string variables that go well beyond Stata's limits. Is there a way to import the file and, at the same time, parse these string variables into their constituent words (delimited by "|") before saving it as a Stata file?
> >
> > A simple example might help:
> > Row1
> > Name1|Name2 comma Address1|Address2 comma PatClass1|PatClass2|PatClass3
> > Row 2
> > Name3|Name4|Name5 comma Address3|Address4|Address5 comma PatClass4
> >
> > Want to get the following structure:
> > Row1
> > Name1 comma Name2 comma "missing info" comma Address1 comma Address2 comma "missing info" comma PatClass1 comma PatClass2 comma PatClass3
> > Row 2
> > Name3 comma Name4 comma Name5 comma Address3 comma Address4 comma Address5 comma PatClass4 comma "missing info" comma "missing info"
> >
> > Any suggestions on how to approach this problem? (This is just a simple example; the text in a cell could go up to 200 words of 30 characters each, and I have 15 of these variables, and 600 files...) Thanks!
> >
> > Denisa
> > University of Illinois Urbana-Champaign
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


