The Stata listserver

Re: st: RE: Inputting arbitrary text files into Stata datasets


From   Roger Newson <roger.newson@kcl.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Inputting arbitrary text files into Stata datasets
Date   Thu, 17 Oct 2002 14:46:10 +0100

At 13:43 17/10/02 +0100, Nick Cox wrote:

I am very curious about why you want to read columns of
program code into string variables. If I wanted to
process code or package files as text, I would do
it in a text editor or scripting language.

The specific application that got me thinking along these lines was website maintenance. On my website (under construction at http://www.kcl-phs.org.uk/rogernewson/ ), I have multiple .htm files, .toc files, .pkg files, .zip files, .ado files and .do files. When I add packages to my website, I run a Stata program that reads all the .pkg files, collects the title line in each one, and generates lists of packages in the .toc files and in tables in the .htm files. A package such as -intext- would make it easier to add further bells and whistles. However, there are probably many other file-bashing applications where this approach would be less bother, and more reliable, than a mixture of manual text editing and DOS scripts.
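
The .pkg-scanning step described above can be sketched roughly as follows. This is an illustration in Python, not the Stata program mentioned in the message, and it rests on two assumptions: that the title of a package is the first line of its .pkg file beginning with "d ", and that a .toc entry takes the form "p name title". The function names pkg_title and toc_lines are hypothetical.

```python
from pathlib import Path


def pkg_title(pkg_path):
    """Return the text of the first 'd ' line in a .pkg file, or None.

    Assumption: the first description line serves as the package title.
    """
    with open(pkg_path, encoding="latin-1") as fh:
        for line in fh:
            if line.startswith("d "):
                return line[2:].strip()
    return None


def toc_lines(directory):
    """Build one 'p name title' line per .pkg file found in directory.

    Assumption: this is the layout wanted for the .toc package list.
    Files without a title line are skipped.
    """
    lines = []
    for path in sorted(Path(directory).glob("*.pkg")):
        title = pkg_title(path)
        if title is not None:
            lines.append(f"p {path.stem} {title}")
    return lines
```

The same list of (name, title) pairs could then feed whatever generates the tables in the .htm files.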


I think quotes are easier than you think. Compound double
quotes don't do any harm beyond adding some visual complexity
to what you read. If this is not true for you, there's a bug
somewhere.

See Nick Winter's second reply regarding the ` (left prime) character.

The limitations of -file- match my understanding. Similar issues
arise in other contexts.

Nick Winter has recently tackled the issue of stripping
commands out of log files to produce the equivalent .do files
and he used -file- to read in logs as if they were binary
files, byte by byte.

Kit Baum and I wrote a wrapper -log2html-
to facilitate translation of SMCL files
to HTML, using -file- to read in logs line by line,
but our program does have the undocumented limitation that it
won't treat uses of local and global macros
correctly.

This is precisely the problem I am trying to get round. Primes mess up macro quoting, which also converts the character pairs \\, \$ and \` to the single characters \, $ and `, respectively. And I don't write many Stata programs without macros. Presumably, however, the macros used in a Stata program are stored somewhere in their pre-quoted form.
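
To make the character-pair conversion concrete, here is a small Python model of it. It reproduces only the substitution as stated in the message above (\\ to \, \$ to $, \` to `) and is not a model of Stata's macro processor as a whole; the name collapse_escaped_pairs is hypothetical.

```python
import re

# A backslash followed by a backslash, a dollar sign or a left prime.
_ESCAPED_PAIR = re.compile(r"\\([\\$`])")


def collapse_escaped_pairs(text):
    """Replace each matching two-character pair with its second character.

    Pairs are consumed left to right, so three backslashes followed by a
    dollar sign collapse to one backslash followed by a dollar sign.
    """
    return _ESCAPED_PAIR.sub(r"\1", text)
```

Under this model the conversion cannot be undone: a line containing \$ and a line containing a bare $ both come out with a bare $, so the original text of a program line is not recoverable once it has passed through macro quoting.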

Best wishes

Roger


--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: roger.newson@kcl.ac.uk

Opinions expressed are those of the author, not the institution.
