
st: Stata's character encoding

From   Billy Schwartz <[email protected]>
To   [email protected]
Subject   st: Stata's character encoding
Date   Mon, 23 Jul 2012 12:03:00 -0400

I'm trying to automatically generate some Stata scripts from an
external program* that by default encodes all text files as UTF-8.
As best I can tell, Stata uses whatever character encoding is native
to the platform it runs on (e.g., Windows-1252 on Windows), which
means the only portable character encoding for reading do-files and
spreadsheet data is plain ASCII (no characters with code points above
127). Stata also seems flexible about whether line endings are LF or
CRLF, but they must be consistent within a given file -- I've had
problems loading spreadsheet data that were CRLF-terminated but had
stray CRs scattered through the data, which made Stata think there
were line endings where there weren't.
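One way to guard against the stray-CR problem described above is to normalize the data file before handing it to Stata. This is a minimal sketch, not anything Stata provides: the sample bytes are invented, and it assumes (as in the situation above) that lone CRs are noise inside CRLF-terminated data rather than legitimate old-Mac line endings.

```python
def normalize_newlines(raw: bytes) -> bytes:
    """Collapse CRLF to LF, then drop any leftover stray CRs.

    Assumes lone CRs are noise within the data, not CR-only line
    endings; the result uses a single consistent terminator (LF).
    """
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"")

# Illustrative input: CRLF-terminated lines with a stray CR
# embedded mid-field, which Stata would misread as a line ending.
raw = b"id,name\r\n1,al\rice\r\n"
print(normalize_newlines(raw))  # b'id,name\n1,alice\n'
```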

Is this a correct characterization of how Stata reads text files? If
not, what is the most portable way for me to encode text for both
do-files and spreadsheet data?

*I'm writing Python scripts to write Stata scripts because I expect my
input data to change several times and I don't want to rewrite the
Stata code by hand each time the underlying data changes. I find
examining directory structures and reading non-tabular data (in this
case, the record layouts for the data I'm working with) easier to
express in Python than in Stata. I'm open to suggestions on better
ways to handle this, but since I've got it mostly written, that's not
the main point of this email.
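For the generation side, the portability concern above can be enforced at write time. This is a minimal sketch, assuming plain ASCII and LF-only endings are the safe common denominator; the filename and the Stata commands are placeholders, not from any real project.

```python
# Illustrative Stata commands; any non-ASCII text here would be
# rejected by the encode check below.
lines = [
    'clear',
    'import delimited using "data.csv"',
    'summarize',
]

script = "\n".join(lines) + "\n"

# Fail fast if any character has a code point above 127, since plain
# ASCII is the only encoding assumed portable across Stata platforms.
script.encode("ascii")  # raises UnicodeEncodeError on non-ASCII text

# newline="\n" keeps LF endings as-is, so the file is consistent
# regardless of the platform the generator runs on.
with open("generated.do", "w", encoding="ascii", newline="\n") as f:
    f.write(script)
```

Opening the output with `encoding="ascii"` makes Python itself refuse any character that would not survive the trip, rather than silently writing UTF-8 that Stata might misinterpret.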