Statalist



Re: st: importing LONG string variables


From   "Mindruta, Denisa Constanta" <mindruta@uiuc.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: importing LONG string variables
Date   Fri, 24 Aug 2007 15:03:24 -0500 (CDT)

The delimiter between names and addresses is a tab (it was a comma in my previous message, but it is easy to change it to a tab, which is less confusing); the delimiter among the addresses (or names) themselves is "|". Because the number of addresses (and names) varies across rows, I will need a program that reads each row, counts the "|" characters within each tab-delimited field, and creates as many new columns as the maximum number of "|" across rows, plus one. It would then put each string between two "|" in its own column. I guess I will have to do this in a different program or text editor, unless someone suggests a solution.
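
The splitting step described above could be sketched roughly as follows in Python, on the assumption that the work is done outside Stata. The function name `widen` and the sample rows are hypothetical and purely illustrative; the real files may need different handling of quoting, encodings, or blank lines.

```python
import csv
import io


def widen(rows, sep="|"):
    """Split every tab-delimited field on `sep` and pad each field's
    parts to the widest row, so columns line up across rows."""
    split_rows = [[field.split(sep) for field in row] for row in rows]
    if not split_rows:
        return []
    n_fields = max(len(r) for r in split_rows)
    # Width of field i = largest number of parts in any row,
    # i.e. (max count of "|") + 1.
    widths = [
        max((len(r[i]) for r in split_rows if i < len(r)), default=1)
        for i in range(n_fields)
    ]
    out = []
    for r in split_rows:
        flat = []
        for i, width in enumerate(widths):
            parts = r[i] if i < len(r) else []
            # Pad short rows with empty strings (missing values).
            flat.extend(parts + [""] * (width - len(parts)))
        out.append(flat)
    return out


# Hypothetical sample: names, a tab, then addresses; "|" inside each field.
sample = "Ann|Bob\t1 Main St|2 Oak Ave|3 Elm Rd\nCarl\t9 Pine Ln\n"
rows = list(csv.reader(io.StringIO(sample), delimiter="\t",
                       quoting=csv.QUOTE_NONE))
wide = widen(rows)
for row in wide:
    print("\t".join(row))
```

Because every row is padded to the same width per field, the first address always lands in the same column, with empty cells where a row has fewer names. The same function could presumably be applied to each of the 600 files in a loop before importing the result into Stata.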
---- Original message ----
>Date: Fri, 24 Aug 2007 15:38:18 -0400
>From: "Friedrich Huebler" <fhuebler@gmail.com>  
>Subject: Re: st: importing LONG string variables  
>To: statalist@hsphsun2.harvard.edu
>
>Sergiy Radyakin <serjradyakin@gmail.com> wrote:
>> The idea of replacing separators in a text editor is good, but it might
>> be quite difficult for the 600 files that Denisa has. It can be automated,
>> of course, say with a macro in Word, but why not write the whole
>> conversion in another programming environment?
>
>A good text editor (see "Some notes on text editors for Stata users",
>http://fmwww.bc.edu/repec/bocode/t/textEditors.html) or a program like
>Windows Grep can process many files in one go without the need for
>macros or additional programming.
>
>Mindruta, Denisa Constanta <mindruta@uiuc.edu> wrote:
>> Friedrich, you got right to the point: but, unfortunately, with the
>> method that you suggested I won't be able to align the information
>> properly. Is there any way of telling Stata to leave a blank on V3
>> Row1 and move Address1 into V4?
>
>I do not see how this is possible given the format of your original
>data. Can you define a rule that identifies which fields should be
>left blank? How should Stata distinguish between names and addresses,
>for example? Even a human would sometimes have difficulty knowing
>which string is part of a name and which string part of an address.
>Take this line:
>
>Denisa,Constanta,Mindruta,Street
>
>Does this mean that the name of the person is Denisa Constanta
>Mindruta, living in a street with a missing name? Or is this a person
>by the name Denisa Constanta, with the third name missing, living in
>Mindruta Street? To be unambiguous, your original data should contain
>one of the following lines, with missing values clearly identified by
>the delimiters.
>
>Denisa,Constanta,Mindruta,,Street
>Denisa,Constanta,,Mindruta,Street
>
>Friedrich
>*
>*   For searches and help try:
>*   http://www.stata.com/support/faqs/res/findit.html
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP