


st: RE: if, in using issues with ado files: syntax bug?

From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: if, in using issues with ado files: syntax bug?
Date   Mon, 12 Mar 2012 18:26:25 +0000

This isn't a bug in -syntax-. It's intended behaviour, or a result of it. It's -syntax-'s job to compare 

(1) what is supplied as if it were a varlist 

with 

(2) what variables are in memory when the program in question is called 

_if and only if_ 

a varlist is also specified to -syntax- as part of what should or could be supplied. It's doing what you asked. 

Note that there's no rule that -syntax- must be fed a varlist as one of the elements of a legal program call. 

What is in -using()-'s argument is a different question. Of course, there are occasions like this when what you care about is what is in the using file, but you have to program accordingly. 

What you need here, I think, is a namelist. 
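
As a rough illustration of the namelist idea, here is a minimal sketch, not a tested or definitive rewrite. The program name maxval2 is hypothetical, the [if] and [in] qualifiers are deliberately left out to keep it short, and -summarize, meanonly- is swapped in for -collapse-. The point is only that -syntax- accepts any legal name for a namelist, so the check that the name is a numeric variable can be deferred until after the using file has been read.

```stata
program define maxval2, rclass
    version 12
    // namelist: -syntax- checks only that a legal name was typed,
    // not that it names a variable currently in memory
    syntax namelist [using/]

    // insist on exactly one name (assumption: one variable at a time)
    local n : word count `namelist'
    if `n' != 1 {
        display as error "specify exactly one variable name"
        exit 198
    }

    if `"`using'"' != "" {
        // data in memory are restored automatically when the program ends
        preserve
        use `namelist' using `"`using'"', clear
    }

    // only now check the name against the data actually being used
    confirm numeric variable `namelist'
    summarize `namelist', meanonly
    return scalar max = r(max)
end
```

Under these assumptions, typing "maxval2 fu using test" and then "display r(max)" should show the maximum of fu in test.dta whether or not a variable fu exists in memory.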

[email protected] 

Kevin Geraghty

Consider the following example .ado file, which uses standard Stata syntax:

program define maxval, rclass
    syntax varname [if] [in] [using]
    version 12

    if `"`using'"' != `""' {
        use `using', clear
    }

    collapse (max) `varlist' `in' `if'
    return local max = `varlist'[1]
end

It is supposed to return the max of a specified numeric variable, subject to any "if" or "in"
 conditions, in the specified "using" data set or the current data set if no "using" data set is specified.

Problem is, it doesn't work as intended because the syntax checker checks the validity of the varlist, and of the "if" and "in" conditions, against the in-memory data set rather than the "using" data set.

That is, suppose you have a data set on disk called "test.dta", containing a numeric variable "fu". If you invoke your ado file as "maxval fu using test", it will throw an error complaining that "fu" is not defined. If, by pure chance, you have an in-memory data set with the variable "fu" defined, the code will work as intended; that is, the program's return value will be the max of "fu" in test.dta, not the max of in-memory "fu". The same is true of the "in" and "if" qualifiers. Suppose your in-memory data set contains ten records, but "test.dta" contains 50. If you invoke "maxval fu in 30/50 using test", it will throw an error complaining about observation numbers out of range.

This strikes me as incorrect behavior, that is, as a Stata bug. Does anybody disagree? Am I missing something?
