
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: RE: Importing text files using different separation rules


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Importing text files using different separation rules
Date   Fri, 20 Jan 2012 15:25:49 +0000

More or less the same issue was aired a few months ago by Adam Ozimek in 

http://www.stata.com/statalist/archive/2011-09/msg00781.html

Several answers in that thread may be helpful. 

In essence, you cannot read data into Stata [not STATA] as string variables if any field is more than 244 characters long. That is an absolute in present Stata. Otherwise put, Stata's str244 limit can't be worked round, within Stata, using variables.  What _you_ regard as a string variable cuts no ice if this limit is violated. 

Similarly, you can't split (or -split-) a too-long variable in Stata _as a variable_ because you can't even read it in as a too-long variable. You can of course concatenate variables shorter than 244 characters so long as the result is no longer than 244. 
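
Before choosing a workaround, it may help to find out which fields actually exceed the limit. The following Python sketch is only an illustration and is not taken from the earlier thread. It assumes a tab-delimited text file with no quoted fields; the names `max_field_lengths`, `columns_over_limit` and `STR_LIMIT` are invented here.

```python
import io

STR_LIMIT = 244  # the str244 limit described above (hypothetical constant name)


def max_field_lengths(lines, delimiter="\t"):
    """Return the longest field length seen at each column position."""
    longest = []
    for line in lines:
        for i, field in enumerate(line.rstrip("\n").split(delimiter)):
            if i >= len(longest):
                longest.append(0)
            longest[i] = max(longest[i], len(field))
    return longest


def columns_over_limit(lines, delimiter="\t", limit=STR_LIMIT):
    """Return 0-based positions of columns holding any field longer than limit."""
    lengths = max_field_lengths(lines, delimiter)
    return [i for i, n in enumerate(lengths) if n > limit]


# Tiny in-memory example: one field in the second column has 300 characters.
sample = io.StringIO("id\ttext\n1\t" + "x" * 300 + "\n2\tshort\n")
print(columns_over_limit(sample))
```

For a real file, passing an open file object instead of `sample` should work the same way, since both yield one line at a time.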

So, the only Stata-based solutions appear to involve something other than variables to start with.  In that thread various styles of solution were discussed, including using -file- to read in the data and then manipulating it with macros; using -filefilter- to create separation characters; and using Mata similarly. 

For example, in various posts on 22 September 2011 I showed three Mata-based programs. 

What is common to all is using Stata (including Mata) to prepare a modified file before you can read the data in as variables. 
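
The idea of preparing a modified file can also be sketched outside Stata. The Python below is a minimal illustration under stated assumptions, not one of the programs from that thread: it assumes tab-delimited input with no quoted fields, that the long fields use ";" as their own separator, and that each long field splits into at most three pieces. The function names and defaults (`split_columns`, `prepare_lines`, `long_cols`, `n_pieces`) are hypothetical.

```python
LIMIT = 244  # Stata's str244 limit, as described in this post


def split_columns(fields, long_cols, inner_sep=";", n_pieces=3, limit=LIMIT):
    """Split the fields at positions long_cols into exactly n_pieces fields.

    All names and defaults here are hypothetical choices for illustration.
    """
    out = []
    for i, field in enumerate(fields):
        if i not in long_cols:
            out.append(field)
            continue
        pieces = field.split(inner_sep)
        if len(pieces) > n_pieces:
            raise ValueError(f"column {i}: more than {n_pieces} pieces")
        # Pad so that every row ends up with the same number of fields.
        out.extend(pieces + [""] * (n_pieces - len(pieces)))
    if any(len(f) > limit for f in out):
        raise ValueError(f"a field is still longer than {limit} characters")
    return out


def prepare_lines(lines, long_cols, main_sep="\t", **kwargs):
    """Yield modified lines in which the long columns have been split."""
    for line in lines:
        fields = line.rstrip("\n").split(main_sep)
        yield main_sep.join(split_columns(fields, long_cols, **kwargs)) + "\n"


# Example: column 1 holds several values separated by ";".
raw = ["1\talpha;beta\tx\n", "2\tgamma\ty\n"]
for new_line in prepare_lines(raw, long_cols={1}):
    print(repr(new_line))
```

Padding each split column to a fixed number of pieces keeps every row at the same width, so the modified lines can be written to a new file and then read in as ordinary delimited data.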

Nick 
n.j.cox@durham.ac.uk 

Seliger Florian wrote:

I want to import text files with STATA, but the problem is that some string variables are too long (they contain too many characters).

Therefore, I'm looking for a way to tell STATA to split these variables according to certain rules when importing the files (the separation character for these variables must differ from the one used for the other variables, which are not too long). STATA only needs to apply these rules to certain variables, not to all of them.

In principle, I would import the too-long variables first and then match them with the other variables in a second step. Unfortunately, "insheet" does not import single variables (STATA reports an error message).



