
From: Yu Zhang <whgyu1@yahoo.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: read text file with multiple spaces
Date: Fri, 19 Aug 2005 06:14:39 -0700 (PDT)

Thanks for all the wonderful suggestions! Unfortunately, since I have
multiple data files and a fairly large number of variables per file, I
guess I will stay with my old way.

Yu

--- Daniel Egan <degan.stata@gmail.com> wrote:

> Hi Yu,
>
> Perhaps as succinct, if not more so, is to open the text file in a
> text editor and replace every instance of "  " (that was two spaces)
> with " " (one space).
>
> You may have to repeat this until a message comes up saying that no
> instances of "  " (two spaces) could be found.
>
> This assumes that you do not have a string variable (for example, an
> ID variable) that has two meaningful spaces within it, such as
> "ABCD  EFG  HIJK".
>
> That's the only caveat I can think of.
>
> Best,
>
> Dan
>
> On 8/19/05, Jayesh Kumar <theindianeconomist@gmail.com> wrote:
> > Since you are already working with Perl, you could have found an
> > easier way out. In this case, I'd replace the spaces with "|" and
> > use the delim option of the insheet command.
> > In Perl you could say: perl -pe 's/ +/|/g' filename
> >
> > If you wish to do it manually: in any text processor, I'd replace
> > all consecutive spaces with "|" using the find-and-replace command,
> > repeating until all consecutive "|" are removed, and then insheet
> > the file.
> >
> > HTH,
> > Jayesh
> > ===================
> > Jayesh Kumar
> >
> > On 8/19/05, Joseph Coveney <jcoveney@bigplanet.com> wrote:
> > > Yu Zhang wrote:
> > >
> > > It's a shame to ask, but does anyone know how to read
> > > data (text file) with multiple spaces between
> > > variables? The number of spaces may vary, so I cannot
> > > use:
> > >
> > > . insheet using file, delim(" ")
> > >
> > > The only way I figured out is to count the number of
> > > variables first (e.g., using Perl) and then use:
> > >
> > > . infile var1-var# using file
> > >
> > > Is there a more direct way?
> > >
> > > ------------------------------------------------------------------
> > >
> > > My guess would be to do the same in Stata as you would do in Perl
> > > to identify variables.
> > >
> > > For example, if there is only a single space between tokens within
> > > any string variable, and there are at least two spaces (maybe
> > > more) between each pair of variables, then:
> > > 1. insheet into Stata into a single string variable (mind the
> > >    limit for string variable length),
> > > 2. use Stata's limited regular expressions capability to convert
> > >    multiple spaces to a convenient delimiter (choose one not
> > >    otherwise present in the string variables' data),
> > > 3. convert multiple delimiters to single delimiters (mind blank
> > >    cells),
> > > 4. export the delimited dataset as an ASCII spreadsheet from Stata
> > >    (using the -noquote- option) to a temporary file, and then
> > > 5. re-import the delimited spreadsheet into Stata.
> > >
> > > Joseph Coveney
> > >
> > > * Creating demonstration spreadsheet
> > > clear
> > > set more off
> > > set obs 3
> > > generate str var1 = "column1  column2  column3"
> > > replace var1 = ///
> > >     "This is the first column.  This is the second column.  " ///
> > >     + "This is the third column." in 2
> > > replace var1 = ///
> > >     "The first-second is two spaces.  " ///
> > >     + "The second-third is four spaces.    " in 3
> > > * Check these last lines above--they might have line-wrapped
> > > * in the e-mail handler.
> > > outsheet using space_delimited_text_spreadsheet.prn, noname noquote
> > > clear
> > > *
> > > * Begin here
> > > *
> > > insheet using space_delimited_text_spreadsheet.prn
> > > * Turn each pair of spaces into a "; " delimiter
> > > replace v1 = subinstr(v1, "  ", "; ", .)
> > > * Collapse doubled delimiters left behind by runs of four spaces
> > > replace v1 = subinstr(v1, "; ; ", "; ", .)
> > > tempfile tmpfil0
> > > outsheet using `tmpfil0', nonames noquote
> > > insheet using `tmpfil0', names delimiter(";") clear
> > > erase `tmpfil0'
> > > list, clean
> > > exit

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
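
For the case of many files, the delimiter idea suggested in this thread could also be scripted once outside Stata and applied to every file. The following is only a minimal sketch in Python, not something posted in the thread. It assumes, as in the suggestions above, that two or more consecutive spaces separate variables and that a single space belongs to a string value. The function names, the `min_spaces` parameter, and the `.prn` and `.psv` file extensions are hypothetical choices made for illustration.

```python
import re
from pathlib import Path


def to_delimited(line: str, delim: str = "|", min_spaces: int = 2) -> str:
    """Replace each run of at least `min_spaces` spaces with one delimiter.

    Leading spaces are dropped so right-aligned first columns do not
    produce an empty first field. Trailing runs of spaces are kept as a
    trailing delimiter, i.e. an empty last cell (an assumption).
    """
    text = line.rstrip("\r\n").lstrip(" ")
    return re.sub(rf" {{{min_spaces},}}", delim, text)


def convert_file(src: Path, dst: Path, delim: str = "|", min_spaces: int = 2) -> None:
    """Write a delimited copy of `src` to `dst`, one output line per input line."""
    with src.open() as fin, dst.open("w") as fout:
        for line in fin:
            fout.write(to_delimited(line, delim, min_spaces) + "\n")


if __name__ == "__main__":
    # Hypothetical layout: convert every *.prn file in the current directory.
    for src in Path(".").glob("*.prn"):
        convert_file(src, src.with_suffix(".psv"))
```

Each converted file could then be read with the delimiter approach mentioned above, for example `insheet using file.psv, delimiter("|")`. If the files contain only numbers and a single space can also separate variables, passing `min_spaces=1` treats every run of spaces as one separator.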

**References**: **Re: st: read text file with multiple spaces** *From:* Daniel Egan <degan.stata@gmail.com>
