
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: RE: if, in using issues with ado files: syntax bug?


From   Kevin Geraghty <[email protected]>
To   [email protected]
Subject   Re: st: RE: if, in using issues with ado files: syntax bug?
Date   Mon, 12 Mar 2012 16:19:48 -0700 (PDT)

Thanks, Nick.

It still feels like unexpected and illogical behavior to me. The syntax permits one to combine "if" and "using", or "in" and "using".  Given the current syntax-checker  behavior, I cannot think of *any* circumstances where it in fact makes sense to write a command with both "if" and "using", or both "in" and "using". So why permit it? Same deal with combining a varlist with "using". 

Note also that the Stata built-in "describe" takes both a varlist and a "using" clause, and it behaves in accordance with my intuition, not with the syntax enforced on user-written ado files. That is, the validity of the varlist is checked against the "using" dataset, not against the one in memory. In truth, though, a cursory sweep does not reveal any Stata built-ins that combine "if" or "in" with "using", and "describe" is the only example I stumbled across of varlist + "using".
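The behaviour of "describe" can be checked with a small sketch like the one below. The temporary file and the variable names "fu" and "other" are made up for the illustration. The "varlist" option of "describe using" is an addition to the point above: it leaves the using file's variable names in r(varlist), which a program could inspect for itself.

```stata
* build a stand-in for the dataset on disk (hypothetical contents)
clear
set obs 50
generate fu = _n
tempfile test
save "`test'"

* put an unrelated dataset in memory: it has no variable named fu
clear
set obs 10
generate other = 1

* the varlist is checked against the using file, so this runs
describe fu using "`test'"

* "other" exists only in memory, so this is expected to fail
capture describe other using "`test'"
local rc_other = _rc
display "return code for -describe other using-: `rc_other'"

* list the using file's variable names without loading the file
quietly describe using "`test'", varlist
local usingvars `r(varlist)'
display "variables in using file: `usingvars'"
```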



 

----- Original Message -----
From: "Nick Cox" <[email protected]>
To: [email protected]
Sent: Monday, March 12, 2012 1:40:19 PM
Subject: Re: st: RE: if, in using issues with ado files: syntax bug?

This is an example. It's not an elegant program, but then I wouldn't
write a program to do what Kevin is doing. (Very likely he wouldn't
either; he's presumably just concocting an example to make a point.)

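program kg_maxval, rclass
version 12
* -if- and -in- arrive as option strings, keywords included,
* e.g. if(if fu > 3) in(in 30/50), so nothing is checked against memory yet
syntax name [using] [, if(str) in(str) ]

preserve
if `"`using'"' != "" {
      use `using', clear
}

* re-parse now that the using dataset (if any) is in memory;
* -syntax name- leaves the name in `namelist'
local 0 `namelist' `if' `in'
syntax varname [if] [in]

collapse (max) `varlist' `in' `if'
return local max = `varlist'[1]
end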
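
The same two-pass idea can be sketched in a self-contained form, followed by a call against a dataset that differs from the one in memory. Everything named here is hypothetical: the program name maxval_sketch, the options cond() and range() (which take the condition and range without their keywords), and the demo variables. It also swaps -collapse- for -summarize, meanonly-, so the result comes back as a scalar in r(max).

```stata
capture program drop maxval_sketch
program maxval_sketch, rclass
    version 12
    * first pass: accept any name, so nothing is checked against memory
    syntax name [using] [, cond(string) range(string)]

    preserve
    if `"`using'"' != "" {
        use `using', clear
    }

    * second pass: rebuild the command line and let -syntax- check it
    * against whichever dataset is in memory now
    local 0 `namelist'
    if `"`cond'"' != "" local 0 `"`0' if `cond'"'
    if `"`range'"' != "" local 0 `"`0' in `range'"'
    syntax varname(numeric) [if] [in]

    quietly summarize `varlist' `if' `in', meanonly
    return scalar max = r(max)
end

* demo data: 50 observations on disk, 10 unrelated ones in memory
clear
set obs 50
generate fu = _n
tempfile test
save "`test'"
clear
set obs 10
generate other = 1

maxval_sketch fu using "`test'", range(30/50)
display r(max)
```

Because -preserve- restores the data automatically when the program exits, the ten-observation dataset is back in memory after the call.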

On Mon, Mar 12, 2012 at 6:37 PM, Nick Cox <[email protected]> wrote:

> What you can also do is pass your -if- and -in- conditions as options.
> That way, they can be used when you need them.
>
> Nick
>
> On Mon, Mar 12, 2012 at 6:26 PM, Nick Cox <[email protected]> wrote:
>> This isn't a bug in -syntax-. It's intended behaviour, or a result of it. It's -syntax-'s job to compare
>>
>> (1) what is supplied as if it were a varlist
>>
>> with
>>
>> (2) what variables are in memory when the program in question is called
>>
>> _if and only if_
>>
>> a varlist is also specified to -syntax- as part of what should or could be supplied. It's doing what you asked.
>>
>> Note that there's no rule that -syntax- must be fed a varlist as one of the elements of a legal program call.
>>
>> What is in -using()-'s argument is a different question. Of course, there are occasions like this when what you care about is what is in the using file, but you have to program accordingly.
>>
>> What you need here, I think, is a namelist.
>>
>> Nick
>> [email protected]
>>
>> Kevin Geraghty
>>
>> Consider the following example .ado file, which uses standard Stata syntax:
>>
>>
>> program define maxval, rclass
>>    syntax varname [if] [in] [using]
>>    version 12
>>
>>    preserve
>>        if `"`using'"' !=`""' {
>>            use `using', clear
>>        }
>>
>>        collapse (max) `varlist' `in' `if'
>>        return local max = `varlist'[1]
>>    restore
>>
>> end
>>
>>
>> It is supposed to return the max of a specified numeric variable, subject to any "if" or "in"
>>  conditions, in the specified "using" data set or the current data set if no "using" data set is specified.
>>
>> Problem is, it doesn't work as intended because the syntax checker checks the validity of the varlist, and of the "if" and "in" conditions, against the in-memory data set rather than the "using" data set.
>>
>> That is, suppose you have a data set on disk called "test.dta", containing a numeric variable "fu". If you invoke your ado file as "maxval fu using test", it will throw an error complaining that "fu" is not defined. If, by pure chance, you have an in-memory data set with the variable "fu" defined, the code will work as intended; that is, the program's return value will be the max of "fu" in test.dta, not the max of in-memory "fu". The same is true of the "in" and "if" qualifiers. Suppose your in-memory data set contains ten records, but "test.dta" contains 50. If you invoke "maxval fu in 30/50 using test", it will throw an error complaining about observation numbers out of range.
>>
>> This strikes me as incorrect behavior, as a Stata bug. Does anybody disagree? Am I missing something?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


