The Stata listserver

st: New package (-intext-) on SSC for inputting arbitrary text files

From   Roger Newson
Subject   st: New package (-intext-) on SSC for inputting arbitrary text files
Date   Fri, 25 Oct 2002 12:43:07 +0100

Dear All

Kit Baum has very kindly placed my -intext- package for distribution on SSC. Type -ssc desc intext- or -net search intext- to find out how to download it.

The -intext- package inputs arbitrary text files into string variables in memory, preserving leading and trailing blanks (which -infix- removes). The package contains two programs, -intext- and -tfconcat-. -intext- inputs a single text file into a list of generated string variables, generating enough variables to contain the longest input text line (including leading and trailing blanks). -tfconcat- concatenates a list of arbitrary text files into a new Stata data set in memory, overwriting any existing data. The new data set contains a list of string variables (as generated by -intext-) and, optionally, additional variables indicating, for each observation, its input text file of origin and/or its sequential order as a line within that file. -tfconcat- is therefore like -dsconcat- (also on SSC), except that it concatenates text files instead of Stata data sets.
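For readers who want a concrete picture of the data layout described above, here is a minimal sketch in Python (not Stata, and not the -intext- source code). It assumes each line is cut into fixed-width pieces, one per generated string variable; the function names, the `width` parameter and the `file_id`/`line_no` keys are all hypothetical, chosen only for illustration.

```python
def split_line(line, width):
    """Cut one text line into consecutive chunks of at most `width` characters.

    Blanks are ordinary characters here, so leading and trailing blanks survive.
    """
    return [line[i:i + width] for i in range(0, len(line), width)]


def lines_to_columns(lines, width=80):
    """Turn text lines into rows of string 'variables' (illustrative only).

    Enough columns are created to hold the longest line; shorter lines
    are padded on the right with empty strings.
    """
    chunked = [split_line(line, width) for line in lines]
    ncols = max((len(chunks) for chunks in chunked), default=0)
    return [chunks + [""] * (ncols - len(chunks)) for chunks in chunked]


def concat_text_files(files, width=80):
    """Stack several text files into one table, in the spirit of -tfconcat-.

    `files` maps a file name to its list of lines (a stand-in for reading
    from disk). Each row records its file of origin and its sequential
    line number within that file.
    """
    origin = []
    all_lines = []
    for file_id, (name, lines) in enumerate(files.items(), start=1):
        for line_no, line in enumerate(lines, start=1):
            origin.append((file_id, name, line_no))
            all_lines.append(line)
    columns = lines_to_columns(all_lines, width)
    return [
        {"file_id": fid, "file_name": name, "line_no": num, "cols": cols}
        for (fid, name, num), cols in zip(origin, columns)
    ]
```

The padding step mirrors the idea of generating enough variables for the longest line: every row ends up with the same number of columns, whichever file it came from.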

-intext- is an inverse of -outfile,runtogether-, because the string variables generated by -intext- can be output using -outfile,runtogether- to produce a duplicate of the original file. Therefore, -intext- enables Stata programs to read Stata programs, just as -outfile,runtogether- enables Stata programs to write Stata programs. I developed -intext- initially so that I could use Stata as a "Perl substitute" for website maintenance, e.g. producing a list of all my Stata .pkg files with accompanying title lines and incorporating that list into an HTML file.
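The inverse relationship can be illustrated with a small Python sketch (again, not Stata code and not how either command is implemented). It assumes that reading cuts each line into fixed-width chunks and that writing runs the chunks together with nothing in between; the names `to_chunks`, `run_together`, `round_trip` and the `width` parameter are hypothetical.

```python
def to_chunks(line, width):
    """Split one line into fixed-width pieces, keeping every blank."""
    return [line[i:i + width] for i in range(0, len(line), width)]


def run_together(chunks):
    """Write the pieces back with no separator between them."""
    return "".join(chunks)


def round_trip(text, width=80):
    """Read text into chunked rows, then write the rows back out."""
    rows = [to_chunks(line, width) for line in text.split("\n")]
    return "\n".join(run_together(row) for row in rows)


# A small program-like text with leading and trailing blanks and a tab.
original = "  program define hello  \n\tdisplay \"hi\"\nend\n"
assert round_trip(original, width=5) == original
```

Because no character is dropped or altered on input, joining the pieces reproduces each line exactly, which is what makes it possible for one program to read, edit and rewrite another.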

I have written -intext- intending it to work under Unix or MacOS as well as under Windows. However, if anybody ever finds that it doesn't, then please let me know.

I would like to thank Nick Cox, Nick Winter and Kit Baum for a lot of very helpful advice, which eventually pointed me towards the right way to write -intext-. Nick Winter's -log2do2- program was especially useful as a practical example of the use of the -file- package for inputting binary files byte by byte into scalars instead of macros, thereby avoiding the hazards of macro quoting. (Note for StataCorp: I still think it would be useful to have functions that read macro contents directly without quoting, like my suggested -lsubstr(localmacroname,n1,n2)- and -gsubstr(globalmacroname,n1,n2)-, and maybe also a -length- extended macro function to measure a macro's length.)

Best wishes


Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605

Opinions expressed are those of the author, not the institution.
