st: Re: String variables over 244 in a dataset with two delimiters


From   "Joseph Coveney" <[email protected]>
To   <[email protected]>
Subject   st: Re: String variables over 244 in a dataset with two delimiters
Date   Tue, 20 Sep 2011 12:40:25 +0900

Adam Ozimek wrote:

I have a dataset that is tab delimited, and one of the variables is a string
that can be over 244 characters. If I read this using insheet, or inputst, or I
think anything else, it truncates this variable. However, there is an aspect of
the string variable that I hope will let me get around this: it is delimited by
semicolon. Is there a way to select one of the columns in a tab delimited
dataset, and read in by parsing it as semi-colon delimited? Is there some
other way to rescue the long variable without the truncation?

--------------------------------------------------------------------------------

There are a couple of ways to approach this problem, but I think that the most
direct is to use Stata's -filefilter- command to convert each semicolon to
double-quote + tab + double-quote, and then read the converted file in with
-insheet-.  (To learn more about -filefilter-, see Stata's online help for the
command or see its entry in the user manual.)
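
A minimal sketch of that conversion, assuming the string column is enclosed in
double quotes in the raw file.  The file names original.txt and converted.txt
are placeholders, not names from the original question.  In -filefilter-
patterns, \Q stands for a double quote and \t for a tab.  Note that this
replaces every semicolon in the file, so it is only suitable if semicolons do
not occur in the other columns.

```stata
* Placeholder file names; substitute your own.
* Turn each semicolon into: closing quote + tab + opening quote
filefilter "original.txt" "converted.txt", from(";") to("\Q\t\Q")

* Read the converted file as tab-delimited
insheet using "converted.txt", tab clear
```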

Notes:

1. This assumes that your string column's contents are surrounded by
double-quotation marks.  If not, then just convert the semicolons to tabs alone.

2. If your tab-delimited file has a header row (column names), then remember to
insert a new name for your newly created column.  There are a couple of ways to
do that, too, in Stata, but again -filefilter- might be the most direct.

3. Don't overwrite your original.  (I'm not sure that -filefilter- will even
allow you to name <newfile> the same as <oldfile>, but if it does, don't do it.)


4. You can make the converted file a temporary file by using -tempfile- in
conjunction with -filefilter-.  This makes cleaning up the project's
intermediate files easier.
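
Notes 1, 2 and 4 together might look something like the sketch below.  It rests
on several assumptions that are mine, not the original poster's: the raw file
is called original.txt, the long column is headed longvar and is not quoted,
the text longvar appears nowhere else in the file, and every row holds exactly
three semicolon-separated pieces.  The names part1, part2 and part3 are
likewise made up for illustration.

```stata
* Hypothetical names throughout; adjust to your own file.
tempfile step1 step2

* Pass 1: unquoted case from note 1, so semicolons become plain tabs
filefilter "original.txt" "`step1'", from(";") to("\t")

* Pass 2: replace the single header name with one name per new column
filefilter "`step1'" "`step2'", from("longvar") to("part1\tpart2\tpart3")

* Read the result; the original file is never overwritten
insheet using "`step2'", tab names clear
```

Both temporary files are erased automatically when the do-file or program that
created them finishes, so nothing needs to be cleaned up by hand.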

Joseph Coveney

