Statalist



Re: st:How to input a portion of a file


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st:How to input a portion of a file
Date   Mon, 18 Feb 2008 18:25:19 +0000

No, think about it: a user-written program for input would have to be written at a fairly low level. How, for example, could you refer to variables that don't yet exist? You need something that reads in a line at a time. It could be done in Mata, it could be done with a plug-in, and it could possibly be done with -file-, but a user-programmer would have to re-create a lot of tedious but essential parsing of lines, consistency checking, and so forth. Not to mention flushing buffers and similar horrors.

I think it is much, much easier to go the obvious way -- to filter the file so that what is left behind is then amenable to one of the input commands. You are resisting that conclusion, but I think it is inescapable.

In Stata, this seems to call for something using -file-.

In Unix -- and by extension in Windows too, as ports aplenty can be found -- this seems to call for something like sed, awk, or perl.

Here is a demonstration program. Unix adepts will love to point out how this is very long-winded compared with one of the utilities mentioned above.

To see how it works, create a file -test.txt- with seven lines containing
the numbers 1 to 7, one number per line.
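
If you would rather not type the test file by hand, it can be written from Stata itself with -file-. This is only a sketch: the filename -test.txt- matches the one used in this post, and -replace- will overwrite any existing file of that name in the current directory.

```stata
* write the numbers 1 to 7, one per line, to test.txt
tempname fh
file open `fh' using "test.txt", write text replace
forvalues i = 1/7 {
    file write `fh' "`i'" _n
}
file close `fh'

* check the result
type test.txt
```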

striplines test.txt, outfile(test2.txt) keep(3/7)
type test2.txt
striplines test.txt, outfile(test2.txt) replace drop(1/3)
type test2.txt
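
Once the file has been filtered, it can be handed to one of the usual input commands. A minimal sketch, assuming -striplines.ado- (given in this post) has been saved somewhere on your adopath and that -test.txt- is as described above; with a single unnamed column, -insheet- should call the variable v1.

```stata
* keep lines 3 to 7 only, then read the filtered file
striplines test.txt, outfile(test2.txt) replace keep(3/7)
insheet using test2.txt, clear
list
```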

*-------------------------------- striplines.ado
*! 1.0.0 NJC 18 Feb 2008
* strip lines from ASCII file
* must specify -outfile()-
* must specify -keep()- or -drop()-, but not both
* may specify -replace-
program striplines
    version 8.2
    syntax anything(name=infile) , outfile(str) ///
        [replace keep(numlist) drop(numlist) ]

    // exactly one of keep() and drop() is required
    if "`keep'" != "" & "`drop'" != "" {
        di as err "specify either keep() or drop(), not both"
        exit 198
    }
    if "`keep'" == "" & "`drop'" == "" {
        di as err "specify keep() or drop()"
        exit 198
    }

    // filenames and handles
    tempname hi ho
    file open `hi' using `"`infile'"', r
    file open `ho' using `"`outfile'"', w `replace'

    // i is the number of the line just read
    local i = 1

    if "`keep'" != "" {
        // copy only the lines whose numbers are in keep()
        file read `hi' line
        while r(eof) == 0 {
            local tokeep : list i in keep
            if `tokeep' file write `ho' `"`line'"' _n
            file read `hi' line
            local ++i
        }
    }
    else {
        // copy every line whose number is not in drop()
        file read `hi' line
        while r(eof) == 0 {
            local todrop : list i in drop
            if !`todrop' file write `ho' `"`line'"' _n
            file read `hi' line
            local ++i
        }
    }

    file close `hi'
    file close `ho'
    di _n `"`outfile' created"'
end
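
The test of whether a line is wanted rests on the -in- operator for macro lists, which yields 1 if every element of the first list appears in the second and 0 otherwise. A small sketch with made-up values, to be run from a do-file or interactively:

```stata
* keep() arrives in the program already expanded to a list of numbers
local keep 3 4 5 6 7

local i = 4
local tokeep : list i in keep
di `tokeep'        // displays 1: line 4 is in the list

local i = 2
local tokeep : list i in keep
di `tokeep'        // displays 0: line 2 is not in the list
```

Note that -list- takes the names of the macros (i, keep), not their contents, so no quotes are needed around them.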

Nick
n.j.cox@durham.ac.uk

Joseph Wagner
=============

I can get the file into excel and the columns line up perfectly. If I
open the file in Crimson editor the columns appear to be tab-delimited
after all (apparently why I was able to use -insheet-). That said, is
there a user-written program that I have missed that will perform
-insheet- like action but with options limiting the data?

