



Re: st: Datamanagement: warning when using infile with optional if


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Datamanagement: warning when using infile with optional if
Date   Tue, 28 Feb 2012 16:08:08 +0000

It's built into Stata that -if- tests every (potential) observation.
How else is Stata to know -- at least in this problem -- that your
test is satisfied? More to your point, adding extra code to ensure
bail-out once a line is known to be invalid would slow -infile- down
more frequently than it speeds it up: at least that's my guess.

-quietly- suppresses the little messages.

There are many ways to work with this kind of file, including deleting
lines from a copy that don't match a regular expression using any
decent text editor or scripting language before you enter Stata.
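
As one illustration of the scripting-language route, here is a minimal
Python sketch, not a definitive recipe. It assumes that a wanted record
has a one-digit identifier in column 1 and the category letter B in
column 2, as in the example data.txt; the function names and the output
file name dataB.txt are hypothetical.

```python
import re

# Assumption: wanted lines start with one digit followed by "B",
# matching the layout of the example data.txt in this thread.
KEEP = re.compile(r"^\dB")


def keep_matching(lines, pattern=KEEP):
    """Return only the lines whose start matches the pattern."""
    return [line for line in lines if pattern.match(line)]


def filter_file(src, dst, pattern=KEEP):
    """Copy src to dst, dropping non-matching lines; return the number kept."""
    with open(src, encoding="ascii") as fin:
        kept = keep_matching(fin, pattern)
    with open(dst, "w", encoding="ascii") as fout:
        fout.writelines(kept)
    return len(kept)


if __name__ == "__main__":
    # Hypothetical file names: the original stays untouched, and
    # -infile- would then read the filtered copy instead.
    filter_file("data.txt", "dataB.txt")
```

With a filtered copy like this, the dictionary would name dataB.txt
rather than data.txt, and the -if- qualifier should no longer be needed
because every remaining line is a B record.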

Nick

On Tue, Feb 28, 2012 at 3:41 PM,  <A.T.Wolters@lse.ac.uk> wrote:

> I am reading ASCII data with a dictionary using the command -infile-
> whilst conditioning on a variable (using -if-) that is read at the same
> time. I created a simplified example to show you what is happening:
>
> The data looks like:
> -----------------data.txt--------------
> 1Ajohn1
> 1B8724
> 2Ajane0
> 2B8625
> 3Amark1
> -----------------------------------------
>
> With dictionary file
> -----------------dctB.dct--------------
> dictionary using data.txt {
>  _column(1)     int     id      %1f     "Identifier"
>  _column(2)     str1    cat     %1s     "Category"
>  _column(3)     int     dob     %2f     "Date of Birth"
>  _column(5)     int     age     %2f     "Age"
>  }
> -----------------------------------------
>
> My aim is to read only those lines where the variable cat is equal to B.
> I do this by making use of the command
> infile using dctB if cat=="B"
>
> I do end up with the required result. Stata does a great job at
> conditioning on a variable that it is reading at the same time, however,
> it returns an error for every line where cat=="A" as it contains
> non-numeric characters, where Stata expects only integers. Not only does
> this produce messy .log files (especially with thousands of lines), it
> indicates that Stata has to read every line completely which is time
> consuming and somewhat unnecessary.
>
> Does anyone have a suggestion to improve on my current method?
> Preferably one that produces readable .log files?
>
