The Stata listserver

st: New package (-intext-) on SSC for inputting arbitrary text files


From   Roger Newson <roger.newson@kcl.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   st: New package (-intext-) on SSC for inputting arbitrary text files
Date   Fri, 25 Oct 2002 12:43:07 +0100

Dear All

Kit Baum has very kindly placed my -intext- package for distribution on SSC. Type -ssc desc intext- or -net search intext- to find out how to download it.

The -intext- package inputs arbitrary text files into string variables in memory, without removing leading and trailing blanks (which -infix- would remove). It contains two programs, -intext- and -tfconcat-. -intext- inputs a single text file into a list of generated string variables, generating enough variables to hold the longest input text line (including leading and trailing blanks). -tfconcat- concatenates a list of arbitrary text files into a new Stata data set in memory, overwriting any existing data. The new data set contains a list of string variables (as generated by -intext-) and, optionally, additional variables indicating, for each observation, its input text file of origin and/or its sequential order as a line within that file. -tfconcat- is therefore like -dsconcat- (also on SSC), except that it concatenates text files instead of Stata data sets.
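For readers who want to picture the resulting data layout, here is a rough sketch in Python rather than Stata. It is an illustration only and is not the -intext- source code: the function names, the dictionary keys, and the piece width of 80 characters are all assumptions made for the example. Each line is cut into consecutive pieces (standing in for the generated string variables), blanks are kept, and shorter lines are padded with empty pieces so that every row has the same number of columns.

```python
import os
import tempfile

PIECE_WIDTH = 80  # assumed width of one string variable (hypothetical value)


def read_lines(path):
    """Return the lines of a text file without their newline terminators."""
    with open(path, "r") as handle:
        return [line.rstrip("\n") for line in handle]


def split_line(line, width=PIECE_WIDTH):
    """Cut one line into consecutive pieces of at most `width` characters."""
    return [line[i:i + width] for i in range(0, len(line), width)]


def intext_like(lines, width=PIECE_WIDTH):
    """One row per line; enough pieces per row to hold the longest line."""
    rows = [split_line(line, width) for line in lines]
    n_vars = max([len(row) for row in rows] + [1])
    return [row + [""] * (n_vars - len(row)) for row in rows]


def tfconcat_like(paths, width=PIECE_WIDTH):
    """Stack several files, recording file of origin and line order."""
    records = []
    for file_number, path in enumerate(paths, start=1):
        for line_number, line in enumerate(read_lines(path), start=1):
            records.append({"file": file_number, "line": line_number,
                            "pieces": split_line(line, width)})
    n_vars = max([len(r["pieces"]) for r in records] + [1])
    for record in records:
        record["pieces"] += [""] * (n_vars - len(record["pieces"]))
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.txt")
        with open(path, "w") as handle:
            handle.write("  indented line  \nshort\n")
        for record in tfconcat_like([path], width=10):
            print(record)
```

Joining the pieces of any row gives back the original line, blanks included, which is the property the package description relies on.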

-intext- is an inverse of -outfile,runtogether-, because the string variables generated by -intext- can be output using -outfile,runtogether- to produce a duplicate of the original file. Therefore, -intext- enables Stata programs to read Stata programs, just as -outfile,runtogether- enables Stata programs to write Stata programs. I developed -intext- initially so that I could use Stata as a "Perl substitute" for website maintenance, e.g. producing a list of all my Stata .pkg files with accompanying title lines and incorporating that list into an HTML file.
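The inverse relationship can be checked with a small round trip. The sketch below is again Python, not Stata, and only mimics the idea: the helper names and the piece width are hypothetical, and it assumes that every line of the input file, including the last, ends with a newline character.

```python
import os
import tempfile


def to_pieces(line, width):
    """Cut a line into pieces of at most `width` characters ([""] if empty)."""
    return [line[i:i + width] for i in range(0, len(line), width)] or [""]


def read_as_pieces(path, width=80):
    """Read a file into rows of string pieces, keeping all blanks."""
    with open(path, "r", newline="") as handle:
        lines = handle.read().split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty item after the final newline
    return [to_pieces(line, width) for line in lines]


def write_run_together(rows, path):
    """Write each row with its pieces run together, one line per row."""
    with open(path, "w", newline="") as handle:
        for row in rows:
            handle.write("".join(row) + "\n")


def round_trip_matches(text, width=80):
    """Return True if reading then writing reproduces `text` exactly."""
    with tempfile.TemporaryDirectory() as folder:
        source = os.path.join(folder, "source.txt")
        copy = os.path.join(folder, "copy.txt")
        with open(source, "w", newline="") as handle:
            handle.write(text)
        write_run_together(read_as_pieces(source, width), copy)
        with open(copy, "r", newline="") as handle:
            return handle.read() == text


if __name__ == "__main__":
    print(round_trip_matches("  program define hello  \n\nend\n", width=8))
```

In this sketch a file whose last line has no terminating newline would gain one on output, so the copy would not be byte-identical in that case.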

I have written -intext- intending it to work under Unix or MacOS as well as under Windows. However, if anybody ever finds that it doesn't, then please let me know.

I would like to thank Nick Cox, Nick Winter and Kit Baum for a lot of very helpful advice, which eventually pointed me towards the right way to write -intext-. Nick Winter's -log2do2- program was especially useful as a practical example of the use of the -file- package for inputting binary files byte by byte into scalars instead of macros, thereby avoiding the hazards of macro quoting. (Note for StataCorp - I still think it would be useful to have functions that read macro contents directly without quoting, like my suggested -lsubstr(localmacroname,n1,n2)- and -gsubstr(globalmacroname,n1,n2)-, and maybe also a -length- extended macro function to measure the length of a macro.)

Best wishes

Roger

--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: roger.newson@kcl.ac.uk

Opinions expressed are those of the author, not the institution.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/



