
From: Joseph Coveney <jcoveney@bigplanet.com>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: Re: st: read text file with multiple spaces
Date: Fri, 19 Aug 2005 13:10:43 +0900

Yu Zhang wrote:

It's a shame to ask, but does anyone know how to read data (text file) with multiple spaces between variables? The number of spaces may vary, so I cannot use:

. insheet using file, delim(" ")

The only way I figured out is to count the number of variables first (e.g., using Perl) and then use:

. infile var1-var# using file

Is there a more direct way?

--------------------------------------------------------------------------------

My guess would be to do the same in Stata as you would do in Perl to identify variables. For example, if there is only a single space between tokens within any string variable, and there are at least two spaces (maybe more) between each pair of variables, then:

1. insheet the file into Stata as a single string variable (mind the limit for string variable length),
2. use Stata's limited regular expressions capability to convert multiple spaces to a convenient delimiter (choose one not otherwise present in the string variables' data),
3. convert multiple delimiters to single delimiters (mind blank cells, which disappear when delimiters are merged),
4. export the delimited dataset as an ASCII spreadsheet from Stata (using the -noquote- option) to a temporary file, and then
5. re-import the delimited spreadsheet into Stata.

Joseph Coveney

* Creating demonstration spreadsheet
clear
set more off
set obs 3
generate str var1 = "column1  column2  column3"
replace var1 = ///
  "This is the first column.  This is the second column.  " ///
  + "This is the third column." in 2
replace var1 = ///
  "The first-second is two spaces.  " ///
  + "The second-third is four spaces.    " in 3
* Check these last lines above--they might have line-wrapped
* in the e-mail handler.
outsheet using space_delimited_text_spreadsheet.prn, noname noquote
clear
*
* Begin here
*
insheet using space_delimited_text_spreadsheet.prn
replace v1 = subinstr(v1, "  ", "; ", .)
replace v1 = subinstr(v1, "; ; ", "; ", .)
tempfile tmpfil0
outsheet using `tmpfil0', nonames noquote
insheet using `tmpfil0', names delimiter(";") clear
erase `tmpfil0'
list, clean
exit
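
As a possible variation on steps 2 and 3, here is a minimal sketch that does not depend on how long each run of spaces is. The two fixed -subinstr()- passes in the demonstration cope with runs of two or four spaces, but a run of six or more would still leave doubled delimiters behind. The sketch assumes the raw lines are already in a single string variable named v1 (as -insheet- produces in the demonstration), that ";" appears nowhere in the data, and that single spaces occur only inside a field. The local macro name `again' is just an illustrative choice.

```stata
* Sketch only: assumes v1 holds one raw line per observation and that
* ";" does not occur anywhere in the data.
local again = 1
while `again' {
    * Shrink every run of three spaces to two; repeating this reduces
    * any run of three or more spaces to exactly two.
    replace v1 = subinstr(v1, "   ", "  ", .)
    count if strpos(v1, "   ") > 0
    local again = r(N) > 0
}
* Each remaining double space marks one boundary between variables.
replace v1 = subinstr(v1, "  ", ";", .)
```

After this, steps 4 and 5 can proceed as in the demonstration. The caveat about blank cells still applies: an empty cell between two variables only shows up as a longer run of spaces, so it is merged away rather than preserved.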

**Follow-Ups**:

- **RE: st: read text file with multiple spaces** *From:* "Donald Spady" <dspady@ualberta.ca>
- **Re: st: read text file with multiple spaces** *From:* Jayesh Kumar <theindianeconomist@gmail.com>
